THE ANNALS
of
MATHEMATICAL
STATISTICS

(Founded by H. C. Carver)

The Official Journal of the Institute of
Mathematical Statistics

VOLUME XX

1949




CONTENTS OF VOLUME XX

Articles

Anderson, T. W. and Herman Rubin. Estimation of the Parameters of a Single Equation in a Complete System of Stochastic Equations . . . 46
Andrews, F. C. and Z. W. Birnbaum. On Sums of Symmetrically Truncated Normal Random Variables . . . 458
Baker, G. A. The Variance of the Proportions of Samples Falling Within a Fixed Interval for a Normal Population . . . 123
Banerjee, K. S. A Note on Weighing Design . . . 300
Bancroft, T. A. Some Recurrence Formulae in the Incomplete Beta Function Ratio . . . 451
Barankin, E. W. Locally Best Unbiased Estimates . . . 477
Berger, Agnes and Abraham Wald. On Distinct Hypotheses . . . 104
Birnbaum, Z. W. and F. C. Andrews. On Sums of Symmetrically Truncated Normal Random Variables . . . 458
Birnbaum, Z. W. and H. S. Zuckerman. A Graphical Determination of Sample Size for Wilks' Tolerance Limits . . . 313
Blom, Gunnar. A Generalization of Wald's Fundamental Identity . . . 439
Boas, R. P., Jr. Representation of Probability Distributions by Charlier Series . . . 370
Bose, R. C. A Note on Fisher's Inequality for Balanced Incomplete Block Designs . . . 619
Chernoff, Herman. Asymptotic Studentization in Testing of Hypotheses . . . 268
David, Herbert T. A Note on Random Walk . . . 603
Doob, J. L. Heuristic Approach to the Kolmogorov-Smirnov Theorems . . . 393
Dvoretzky, Aryeh. On the Strong Stability of a Sequence of Events . . . 290
Dwyer, Paul S. Pearsonian Correlation Coefficients Associated with Least Squares Theory . . . 404
Epstein, Benjamin. A Modified Extreme Value Problem . . . 99
Epstein, Benjamin. The Distribution of Extreme Values in Samples Whose Members are Subject to a Markoff Chain Condition . . . 590
Erdös, P. On a Theorem of Hsu and Robbins . . . 280
Godwin, H. J. A Note on Kac's Derivation of the Distribution of the Mean Deviation . . . 127
Godwin, H. J. Some Low Moments of Order Statistics . . . 279
Goodman, Leo A. On the Estimation of the Number of Classes in a Population . . . 572
Greenwood, Robert E. Numerical Integration for Linear Sums of Exponential Functions . . . 608
Grubbs, Frank E. On Designing Single Sampling Inspection Plans . . . 242
Halmos, Paul R. and L. J. Savage. Application of the Radon-Nikodym Theorem to the Theory of Sufficient Statistics . . . 225



VOLUME INDEX 


Hansen, Morris H. and William N. Hurwitz. On the Determination of Optimum Probabilities in Sampling . . . 420
Hatke, Sister Mary Agnes. A Certain Cumulative Probability Function . . .
Hoel, P. G. and R. P. Peterson. A Solution to the Problem of Optimum Classification . . . 433
Horton, H. Burke and R. Tynes Smith, III. A Direct Method for Producing Random Digits in Any Number System . . . 82
Howell, John M. Control Chart for Largest and Smallest Values . . . 205
Hurwitz, William N. and Morris H. Hansen. On the Determination of Optimum Probabilities in Sampling . . . 420
Kimball, Bradford F. An Approximation to the Sampling Variance of an Estimated Maximum Value of Given Frequency Based on Fit of Doubly Exponential Distribution of Maximum Values . . . 110
Lehmann, E. L. and C. Stein. On the Theory of Some Non-Parametric Hypotheses . . . 28
Lev, Joseph. The Point Biserial Coefficient of Correlation . . . 125
Levene, Howard. On a Matching Problem Arising in Genetics . . . 91
Matern, Bertil. Independence of Non-Negative Quadratic Forms in Normally Correlated Variables . . . 119
Madow, William G. On the Theory of Systematic Sampling, II . . . 333
McMillan, Brockway. Spread of Minima of Large Samples . . . 441
Mood, A. M. Tests of Independence in Contingency Tables as Unconditional Tests . . . 114
Noether, Gottfried E. On a Theorem by Wald and Wolfowitz . . . 455
Olds, Edwin G. The 5% Significance Levels for Sums of Squares of Rank Differences and a Correction . . . 117
Otter, Richard. The Multiplicative Process . . . 206
Paulson, Edward. A Multiple Decision Procedure for Certain Problems in the Analysis of Variance . . . 95
Peiser, A. M. Correction to "Asymptotic Formulas for Significance Levels of Certain Distributions" . . . 128
Peterson, R. P. and P. G. Hoel. A Solution to the Problem of Optimum Classification . . . 433
Pitman, E. J. G. and Herbert Robbins. Application of the Method of Mixtures to Quadratic Forms in Normal Variates . . . 552
Quenouille, M. H. Problems in Plane Sampling . . . 355
Quenouille, M. H. The Joint Distribution of Serial Correlation Coefficients . . .
Reich, Edgar. On the Convergence of the Classical Iterative Method of Solving Linear Simultaneous Equations . . . 448
Riordan, John. Inversion Formulas in Normal Variable Mapping . . . 417
Robbins, Herbert and E. J. G. Pitman. Application of the Method of Mixtures to Quadratic Forms in Normal Variates . . . 552






















Rubin, Herman and T. W. Anderson. Estimation of the Parameters of a Single Equation in a Complete System of Stochastic Equations . . . 46
Savage, L. J. and Paul R. Halmos. Application of the Radon-Nikodym Theorem to the Theory of Sufficient Statistics . . . 225
Sard, Arthur. Smoothest Approximation Formulas . . . 612
Seth, G. R. On the Variance of Estimates . . . 1
Sobel, Milton and Abraham Wald. A Sequential Decision Procedure for Choosing One of Three Hypotheses Concerning the Unknown Mean of a Normal Distribution . . . 502
Smith, R. Tynes, III and H. Burke Horton. A Direct Method for Producing Random Digits in Any Number System . . . 82
Stein, C. and E. L. Lehmann. On the Theory of Some Non-Parametric Hypotheses . . . 28
Tukey, John W. Sufficiency, Truncation and Selection . . . 309
Tukey, John W. Moments of Random Group Size Distributions . . . 523
Von Schelling, Hermann. A Formula for the Partial Sums of Some Hypergeometric Series . . . 120
Wald, Abraham and Agnes Berger. On Distinct Hypotheses . . . 104
Wald, Abraham. Statistical Decision Functions . . . 165
Wald, Abraham. Note on the Consistency of the Maximum Likelihood Estimate . . . 595
Wald, Abraham and Milton Sobel. A Sequential Decision Procedure for Choosing One of Three Hypotheses Concerning the Unknown Mean of a Normal Distribution . . . 502
Walsh, John E. Some Significance Tests for the Median which are Valid Under Very General Conditions . . . 64
Walsh, John E. On the Range-Midrange Test and Some Tests with Bounded Significance Levels . . . 257
Walsh, John E. On the Power Function of the "Best" t-test Solution of the Behrens-Fisher Problem . . . 616
Walsh, John E. Concerning Compound Randomization in a Binary System . . . 580
Wolfowitz, J. On Wald's Proof of the Consistency of the Maximum Likelihood Estimate . . . 601
Wolfowitz, J. The Power of the Classical Tests Associated with the Normal Distribution . . . 540
Woodbury, Max A. On a Probability Distribution . . . 311
Yosida, Kôsaku. Brownian Motion on the Surface of the 3-Sphere . . . 292
Zuckerman, H. S. and Z. W. Birnbaum. A Graphical Determination of Sample Size for Wilks' Tolerance Limits . . . 313


Miscellaneous

Abstracts of Papers . . . 130, 317, 464, 620
Constitution and By-Laws of the Institute . . . 327

































Election of Officers and Council and Revision of By-Laws
News and Notices
Report on the Berkeley Meeting of the Institute
Report on the Boulder Meeting of the Institute
Report on the Cleveland Meeting of the Institute
Report on the New York Meeting of the Institute
Report on the Seattle Meeting of the Institute
Report of the President of the Institute
Report of the Secretary-Treasurer of the Institute
Report of the Editor








ON THE VARIANCE OF ESTIMATES 

By G. R. Seth
Columbia University

Summary. In this paper recent results on the lower bound to the variance of unbiased estimates have been brought together. Some of them have been extended to sequential estimates and the others have been improved to some extent. In the last section a general method for generating a system of orthogonal polynomials with respect to a certain class of weight functions is obtained together with a result on the conditions under which the class of unbiased estimates formed by all functions of an unbiased estimate consists of just one element.


1. Introduction. 

§1.1. Let $X_1, X_2, \cdots$ be a sequence of chance variables whose distribution depends upon an unknown parameter $\theta$ and possibly also a finite number of other parameters. It is assumed that either all the $X$'s are absolutely continuous or that they are all discrete. Let $p_m(x_1, x_2, \cdots, x_m; \theta)$ denote the joint probability density function or the probability of $(X_1, \cdots, X_m)$ according as the $X$'s are continuous or discrete. Let $\theta^*(x_1, x_2, \cdots, x_n)$ be an unbiased estimate of $\theta$, where $x_1, x_2, \cdots, x_n$ is a sequence of observations on $X_1, X_2, \cdots, X_n$.

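In this paper, we shall make use of the following short forms and abbreviations:

$E(X)$ will represent the expectation of $X$.

$\sigma^2(X)$ will represent the variance of $X$.

$E(y \mid x)$ will represent the conditional expectation of $y$, given $x$.

$\theta^*$ will represent an abbreviation of $\theta^*(x_1, x_2, \cdots, x_n)$.

$f$ will represent an abbreviation of $f(x; \theta)$ or $f(x; \theta_1, \cdots, \theta_T)$.

$p_n$ will represent an abbreviation of $p_n(x_1, x_2, \cdots, x_n; \theta)$ or $p_n(x_1, x_2, \cdots, x_n; \theta_1, \theta_2, \cdots, \theta_T)$.

$p_N$ will represent $p_n$ for a fixed size sample, i.e., $n = N$.

$g$ will represent an abbreviation of $g(\theta^*; \theta)$ or $g(\theta_1^*, \theta_2^*, \cdots, \theta_T^*; \theta_1, \theta_2, \cdots, \theta_T)$.

$h$ will represent $h(t_1, t_2, \cdots, t_{N-1} \mid \theta^*; \theta)$ or $h(t_1, t_2, \cdots, t_{N-T} \mid \theta_1^*, \theta_2^*, \cdots, \theta_T^*; \theta_1, \theta_2, \cdots, \theta_T)$.

$\varphi_{i_1 i_2 \cdots i_T}(n)$ will represent $\dfrac{1}{p_n} \dfrac{\partial^{i_1 + i_2 + \cdots + i_T} p_n}{\partial\theta_1^{i_1} \partial\theta_2^{i_2} \cdots \partial\theta_T^{i_T}}$.

$g_{i_1 i_2 \cdots i_T}$ will represent $\dfrac{1}{g} \dfrac{\partial^{i_1 + i_2 + \cdots + i_T} g}{\partial\theta_1^{i_1} \partial\theta_2^{i_2} \cdots \partial\theta_T^{i_T}}$.

$h_{i_1 i_2 \cdots i_T}$ will represent $\dfrac{1}{h} \dfrac{\partial^{i_1 + i_2 + \cdots + i_T} h}{\partial\theta_1^{i_1} \partial\theta_2^{i_2} \cdots \partial\theta_T^{i_T}}$.

In case differentiations with respect to one parameter are involved, the last three abbreviations will be shortened to $\varphi_i(n)$, $g_i$ and $h_i$ respectively.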



In §1.1, $n$ is assumed to be a constant equal to $N$, that is, the sequence of chance variables is finite and fixed, consisting of $X_1, X_2, \cdots, X_N$.

Cramér [1] and Rao [2] have shown that under certain conditions of regularity, the variance of $\theta^*(x_1, x_2, \cdots, x_N)$ satisfies the inequality:

(1.1.1) $\sigma^2(\theta^*(x_1, x_2, \cdots, x_N)) \ge \dfrac{1}{E\left(\dfrac{\partial \log p_N}{\partial\theta}\right)^2}$.

Cramér [1] has shown that the lower bound for the variance of $\theta^*(x_1, x_2, \cdots, x_N)$ given by (1.1.1) is achieved if and only if:

(1.1.2). There exists a sufficient statistic for estimating $\theta$.

(1.1.3). The probability distribution $g(\theta^*; \theta)$ of the sufficient statistic $\theta^*(x_1, x_2, \cdots, x_N)$ is of the form

$\theta^*(x_1, x_2, \cdots, x_N) - \theta = K \dfrac{\partial \log g}{\partial\theta}$, whenever $g(\theta^*; \theta) > 0$,

where $K$ depends only upon $N$ and the parameters in the distribution.

Cramér calls the statistic $\theta^*(x_1, x_2, \cdots, x_N)$ satisfying (1.1.2) and (1.1.3) an "efficient" statistic estimating $\theta$ and we will use the word "efficient" in this sense alone. Bhattacharyya [3] has shown that there exists a lower bound to the variance of $\theta^*(x_1, x_2, \cdots, x_N)$ which is higher than or equal to the one given in (1.1.1). This lower bound is ${}_{(m)}\lambda^{11}$, that is,

(1.1.4) $\sigma^2(\theta^*(x_1, x_2, \cdots, x_N)) \ge {}_{(m)}\lambda^{11}$,

where

$\| {}_{(m)}\lambda^{ij} \| = \| \lambda_{ij} \|^{-1}$

and

(1.1.5) $\lambda_{ij} = E\left( \dfrac{1}{p_N} \dfrac{\partial^i p_N}{\partial\theta^i} \cdot \dfrac{1}{p_N} \dfrac{\partial^j p_N}{\partial\theta^j} \right)$, $i, j = 1, 2, \cdots, m$,

where $m$ is any positive integer.


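Let $\theta$ consist of $T$ components $\theta_1, \theta_2, \cdots, \theta_T$, and $p_N(x_1, x_2, \cdots, x_N; \theta_1, \cdots, \theta_T)$ be the same as $\prod_{i=1}^{N} f(x_i; \theta_1, \theta_2, \cdots, \theta_T)$. Further let $\theta_1^*(x_1, x_2, \cdots, x_N)$, $\theta_2^*(x_1, x_2, \cdots, x_N)$, $\cdots$, $\theta_T^*(x_1, x_2, \cdots, x_N)$ be unbiased estimates of $\theta_1, \theta_2, \cdots, \theta_T$ respectively, with the non-singular covariance matrix $\| V_{ij} \|$ $(i, j = 1, 2, \cdots, T)$. Cramér [4] has proved that under certain regularity conditions, the ellipsoid

(1.1.6) $\sum_{i,j=1}^{T} V^{ij} u_i u_j = T + 2$

contains within itself the ellipsoid

(1.1.7) $\sum_{i,j=1}^{T} L_{ij} u_i u_j = T + 2$,

where

(1.1.8) $\| V^{ij} \| = \| V_{ij} \|^{-1}$,

and

(1.1.9) $L_{ij} = N \cdot E\left( \dfrac{\partial \log f}{\partial\theta_i} \dfrac{\partial \log f}{\partial\theta_j} \right)$.

This result is also implicitly contained in Rao [2].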

§1.2. Let us now take $n$ as a chance variable determined by a sequential procedure. $X_1, X_2, X_3, \cdots$ is a sequence of chance variables having the same probability density or probability $f(x; \theta)$, according as $X$ is absolutely continuous or discrete. The sequential process tells us, after each successive observation has been drawn, whether the next observation is to be taken or not. Thus $n$ will denote the total number of observations taken by the time the sequential process has been completed. Under certain regularity conditions, Wolfowitz [5] has shown that if $\theta^*(x_1, x_2, \cdots, x_n)$ is an unbiased estimate of $\theta$, then

(1.2.1) $\sigma^2(\theta^*(x_1, x_2, \cdots, x_n)) \ge \dfrac{1}{En \cdot E\left( \dfrac{\partial}{\partial\theta} \log f(x; \theta) \right)^2}$.

Furthermore, if $\theta$ consists of $T$ components, $\theta_1, \theta_2, \cdots, \theta_T$, and $\theta_1^*(x_1, x_2, \cdots, x_n)$, $\theta_2^*(x_1, x_2, \cdots, x_n)$, $\cdots$, $\theta_T^*(x_1, x_2, \cdots, x_n)$ are unbiased estimates of $\theta_1, \theta_2, \cdots, \theta_T$ respectively, Wolfowitz [5] has proved that

(1.2.2) $\sum_{i,j=1}^{T} L_{ij} u_i u_j = T + 2$

is contained within the ellipsoid

(1.2.3) $\sum_{i,j=1}^{T} V^{ij} u_i u_j = T + 2$,

where

$L_{ij} = En \cdot E\left( \dfrac{\partial \log f}{\partial\theta_i} \dfrac{\partial \log f}{\partial\theta_j} \right)$, $i, j = 1, \cdots, T$.

Blackwell and Girshick [6] have shown that the lower bound given by (1.2.1) for the variance of an unbiased estimate of $\theta$ is attained only for the sequential process for which $\Pr(n = N) = 1$, if the probability density function $f(x; \theta)$ of $X$ is such that $E(X) = \theta$ and $x_1 + x_2 + x_3 + \cdots + x_M$ is a sufficient statistic for all integral values of $M$, for estimating $\theta$; $x_1, x_2, \cdots, x_M$ being $M$ independent observations on the chance variable $X$.

§1.3. In this paper the following results have been obtained. The specific conditions under which the results hold are stated at their proper places along with the results:

(1.3.1). The lower bound in (1.1.4) is valid when $n$ is considered a chance variable determined by a sequential procedure instead of being a fixed number $N$.

(1.3.2). The concentration ellipsoid defined in (1.2.3) contains within itself another ellipsoid

$\sum_{i,j=1}^{T} \mu_{ij} u_i u_j = T + 2$,

where $\mu_{ij}$ is given by (3.1.18), which in turn contains the ellipsoid given by (1.2.2).

(1.3.3). The Blackwell and Girshick result [6] for the achievement of the lower bound for the variance of unbiased estimates given by (1.2.1) has been extended to the case where the probability density (or probability) $\prod_{i=1}^{M} f(x_i; \theta)$, for all fixed $M > N$, where $N$ is the least value for which $\Pr(n = N) \ne 0$, has an unbiased "efficient" estimate for $\theta$ in the sense defined by Cramér. This is illustrated by two examples of Wald sequential procedures.

(1.3.4). Let $N$ be fixed and $p_N(x_1, x_2, \cdots, x_N; \theta) \cdot |J| = g(\theta^*; \theta)\, h(t_1, t_2, \cdots, t_{N-1} \mid \theta^*; \theta)$, where $J$ denotes the Jacobian of the transformation from $x_1, x_2, \cdots, x_N$ to $\theta^*, t_1, t_2, \cdots, t_{N-1}$. Here $g(\theta^*; \theta)$ and $h(t_1, t_2, \cdots, t_{N-1} \mid \theta^*; \theta)$ are respectively the probability density function (or probability) of $\theta^*$ and the conditional probability density function (or probability) of $t_1, t_2, \cdots, t_{N-1}$ for a given value of $\theta^*$.

The necessary and sufficient conditions under which the lower bound for the variance of unbiased estimates given by Bhattacharyya [3] may be achieved are that there should exist a statistic $\theta^*(x_1, x_2, \cdots, x_N)$ such that:

(a) $h_1, h_2, \cdots, h_m$ are linearly dependent considered as functions of $t_1, t_2, \cdots, t_{N-1}$ for given values of $\theta$ and $\theta^*(x_1, x_2, \cdots, x_N)$, and

(b) the probability density $g(\theta^*; \theta)$ of $\theta^*(x_1, x_2, \cdots, x_N)$ satisfies the following equation:

$\theta^*(x_1, x_2, \cdots, x_N) - \theta = \sum_{i=1}^{m} K_i \dfrac{1}{g(\theta^*; \theta)} \dfrac{\partial^i g(\theta^*; \theta)}{\partial\theta^i}$,

where the $K_i$ are independent of $x_1, x_2, x_3, \cdots, x_N$.

Equivalent conditions for the multiparameter case have also been given.

(1.3.5). The following properties of $\varphi_1(n), \varphi_2(n), \cdots$ are derived:

(a) Under certain conditions $\varphi_1(N), \varphi_2(N), \cdots$ form a system of orthogonal polynomials in $\varphi_1(N)$, the weight function being $p_N(x_1, x_2, \cdots, x_N; \theta)$.

(b) $\sum K_i \varphi_i(n)$ cannot be a function of $x_1, x_2, \cdots, x_n$, independent of $\theta$, except for the constant zero.

dependent upon «,(a), then no other 

clnlt . + b where a and b are 

fl 3 Tf 1 <=^11 be linearly related with 4«i(n). 

(1.3.6). If a) $\theta^*(x_1, x_2, \cdots, x_N)$ is an unbiased estimate of $\theta$ and b) if among all functions of $\theta^*(x_1, x_2, \cdots, x_N)$ which are unbiased estimates of $\theta$ with finite variance, $\theta^*$ is the one with the least variance and such that the set of polynomials with respect to the distribution function of $\theta^*$ is complete, then there is no other function of $\theta^*$ having a finite variance which is an unbiased estimate of $\theta$.

2. Estimation of a single parameter. 

§2.1. Let $X_1, X_2, \cdots$ and $p_M(x_1, x_2, \cdots, x_M; \theta)$ be as given in the first paragraph of §1.1. Let $\Omega$ be the space of all possible infinite sequences $(\omega)$ of observations $x_1, x_2, \cdots$. Let there be given an infinite sequence of Borel measurable functions $\Phi_1(x_1)$, $\Phi_2(x_1, x_2)$, $\cdots$, $\Phi_j(x_1, x_2, \cdots, x_j)$, $\cdots$, defined for all observable sequences in $\Omega$ such that each takes only the values zero and one. We further assume that everywhere in $\Omega$, except possibly on a set whose probability is zero for all $\theta$ under consideration, at least one of the functions $\Phi_1(x_1)$, $\Phi_2(x_1, x_2)$, $\cdots$ takes the value of one. Let $n$ be the smallest integer for which this occurs. Thus $n(\omega)$ is a chance variable. The sequential process is then defined as follows:

Take an observation and find $\Phi_1(x_1)$. If it is unity, the sampling process stops; otherwise continue sampling. If a second observation is taken and the value of $\Phi_2(x_1, x_2)$ is unity, the process stops; otherwise continue sampling, and so on. In general, if after taking $j$ observations

$\Phi_i(x_1, x_2, \cdots, x_i) = 0$ for $i = 1, 2, \cdots, j - 1$,

and $\Phi_j(x_1, x_2, \cdots, x_j) = 1$, sampling stops; otherwise it is continued. We will denote by $R_j$ the set of all points $(x_1, x_2, \cdots, x_j)$ for which the process stops with the $j$th observation.
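
As an illustration that is not part of the original argument, the sequential process just described may be sketched in Python. The names `run_sequential`, `stop_rule` and `sum_exceeds_three` are hypothetical; a single function of the observed prefix plays the role of the whole family $\Phi_1, \Phi_2, \cdots$, and the cap `max_steps` is an added safeguard that the text does not need, since it assumes the rule fires with probability one.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

# A stopping rule maps the observed prefix (x_1, ..., x_j) to 0 or 1,
# standing in for the family Phi_1, Phi_2, ... of the text.
StopRule = Callable[[Sequence[float]], int]


def run_sequential(observations: Iterator[float],
                   stop_rule: StopRule,
                   max_steps: int = 10_000) -> Tuple[int, List[float]]:
    """Draw observations until stop_rule(prefix) == 1; return (n, sample)."""
    sample: List[float] = []
    for x in observations:
        sample.append(x)
        if stop_rule(sample) == 1:
            return len(sample), sample      # the point lies in R_n
        if len(sample) >= max_steps:
            break
    raise RuntimeError("stopping rule never fired")


def sum_exceeds_three(xs: Sequence[float]) -> int:
    """Hypothetical rule: stop as soon as the running sum exceeds 3."""
    return 1 if sum(xs) > 3 else 0


n, sample = run_sequential(iter([1.0, 0.5, 2.0, 4.0]), sum_exceeds_three)
```

Here the running sums are 1.0, 1.5 and 3.5, so sampling stops with the third observation and the fourth is never drawn.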

Let $\theta^*(x_1, x_2, \cdots, x_n)$ be a statistic whose expectation is a real valued function $\gamma(\theta)$ of $\theta$. The development proceeds on the assumption that $p_M(x_1, x_2, \cdots, x_M; \theta)$ is a probability density function. The result is equally valid if $p_M(x_1, x_2, \cdots, x_M; \theta)$ is the probability of discrete variables $X_1, X_2, \cdots, X_M$ provided that integration is replaced by summation whenever this is required. Further the phrase "almost all points" in a Euclidean space of any finite dimensionality is understood to mean all points in the space with the following possible exceptions:

(a). A set of Lebesgue measure zero where $p_M(x_1, x_2, \cdots, x_M; \theta)$ is the probability density function;

(b). The points which belong to the set $Z$, where $p_M(x_1, x_2, \cdots, x_M; \theta)$ is the probability function of the discrete chance variables $X_1, X_2, \cdots, X_M$. The set $Z$ consists of all points $(x_1, x_2, \cdots, x_M)$ such that $p_M(x_1, x_2, \cdots, x_M; \theta) = 0$ identically for all $\theta$ under consideration.

§2.2. Conditions of regularity. We will postulate the following conditions to be satisfied by $p_M(x_1, x_2, \cdots, x_M; \theta)$ and $\theta^*(x_1, x_2, \cdots, x_n)$.

(2.2.1). $\theta^*(x_1, x_2, \cdots, x_n)$ has an expectation $\gamma(\theta)$ and a finite variance. All the derivatives of $\gamma(\theta)$ are assumed to be finite. The parameter $\theta$ lies in an open interval $D$ of the real line. $D$ may consist of the entire line or an entire half line.

(2.2.2). The derivatives

$\dfrac{\partial^i p_M}{\partial\theta^i}$, $(i = 1, 2, \cdots, m)$,

exist for all $\theta$ in $D$ and almost all $x_1, x_2, \cdots, x_M$ in $R_M$ and for all $M$. We define

$\dfrac{1}{p_M} \dfrac{\partial^i p_M}{\partial\theta^i} = 0$

whenever $p_M(x_1, x_2, \cdots, x_M; \theta) = 0$; thus,

$\varphi_i(M) = \dfrac{1}{p_M} \dfrac{\partial^i p_M}{\partial\theta^i}$

is defined for all $\theta$ in $D$ and almost all $(x_1, x_2, \cdots, x_M)$ in $R_M$.

(2.2.3). For any integral $j$ there exist non-negative L-measurable functions $T_i(x_1, x_2, \cdots, x_j)$, $(i = 1, 2, \cdots, m)$, such that

(a) $\left| \theta^*(x_1, x_2, \cdots, x_j) \cdot \dfrac{\partial^i}{\partial\theta^i} p_j(x_1, x_2, \cdots, x_j; \theta) \right| < T_i(x_1, x_2, \cdots, x_j)$ for all $\theta$ in $D$ and almost all $(x_1, x_2, \cdots, x_j)$ in $R_j$,

(b) $\int_{R_j} T_i(x_1, x_2, \cdots, x_j) \prod_{u=1}^{j} dx_u$, $(i = 1, 2, \cdots, m)$, are finite.

(2.2.4). Let $t_j(\theta) = \int_{R_j} \theta^*(x_1, x_2, \cdots, x_j)\, p_j(x_1, x_2, \cdots, x_j; \theta) \prod_{u=1}^{j} dx_u$. We postulate the uniform convergence of

$\sum_{j=1}^{\infty} \dfrac{d^i}{d\theta^i} t_j(\theta)$, $(i = 1, 2, \cdots, m)$

(the existence of $\dfrac{d^i}{d\theta^i}(t_j(\theta))$ is assured by the assumption (2.2.3)).

(2.2.5). There exist functions $S_i(x_1, x_2, \cdots, x_j)$ for every $j$, $(i = 1, 2, \cdots, m)$, such that when $\theta^*(x_1, x_2, \cdots, x_j)$ and $T_i(x_1, x_2, \cdots, x_j)$ are replaced by unity and $S_i(x_1, x_2, \cdots, x_j)$ respectively, conditions (2.2.3) and (2.2.4) still hold good.

(2.2.6). The covariance matrix of $\varphi_i(n)$ $(i = 1, \cdots, m)$ exists and is non-singular for almost all $\theta$ in $D$ and almost all $(x_1, x_2, \cdots, x_n)$.

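§2.3. Let us consider the sequential process mentioned in §2.1 and the functions $\theta^*(x_1, x_2, \cdots, x_n)$ and $p_M(x_1, x_2, \cdots, x_M; \theta)$ which satisfy the regularity conditions in §2.2. We will now find a lower bound for the variance of such estimates.

Let us examine

(2.3.1) $F = E\left[ \theta^*(x_1, x_2, \cdots, x_n) - \gamma(\theta) - \sum_{i=1}^{m} K_i \varphi_i(n) \right]^2$,

where $K_i$ $(i = 1, 2, \cdots, m)$ are independent of $(x_1, x_2, \cdots, x_n)$. Now (2.3.1) can be written as

(2.3.2) $F = \sigma^2(\theta^*(x_1, x_2, \cdots, x_n)) - 2 \sum_{i=1}^{m} K_i E\,\theta^*(x_1, x_2, \cdots, x_n) \varphi_i(n) + 2\gamma(\theta) \sum_{i=1}^{m} K_i E \varphi_i(n) + \sum_{i,j=1}^{m} K_i K_j \lambda_{ij}$,

where

(2.3.3) $\lambda_{ij} = E(\varphi_i(n) \varphi_j(n))$, $(i, j = 1, 2, \cdots, m)$.

Now

(2.3.4) $E(\theta^*(x_1, x_2, \cdots, x_n) \varphi_i(n)) = \sum_{j=1}^{\infty} \int_{R_j} \theta^*(x_1, x_2, \cdots, x_j) \dfrac{\partial^i p_j}{\partial\theta^i} \prod_{u=1}^{j} dx_u$.

We also know that

(2.3.5) $\sum_{j=1}^{\infty} \int_{R_j} \theta^*(x_1, x_2, \cdots, x_j)\, p_j \prod_{u=1}^{j} dx_u = \gamma(\theta)$.

Differentiating both sides of (2.3.5) $i$ times $(i = 1, 2, \cdots, m)$ we have, because of conditions (2.2.3) and (2.2.4):

(2.3.6) $\sum_{j=1}^{\infty} \int_{R_j} \theta^*(x_1, x_2, \cdots, x_j) \dfrac{\partial^i p_j}{\partial\theta^i} \prod_{u=1}^{j} dx_u = \dfrac{d^i \gamma(\theta)}{d\theta^i}$, $(i = 1, 2, \cdots, m)$.

From (2.3.4) and (2.3.6), we obtain

(2.3.7) $E(\theta^*(x_1, x_2, \cdots, x_n) \varphi_i(n)) = \dfrac{d^i \gamma(\theta)}{d\theta^i}$.

Differentiating

(2.3.8) $1 = \sum_{j=1}^{\infty} \int_{R_j} p_j \prod_{u=1}^{j} dx_u$

$i$ times $(i = 1, 2, \cdots, m)$ with respect to $\theta$, we obtain because of conditions (2.2.5)

(2.3.9) $0 = \sum_{j=1}^{\infty} \int_{R_j} \dfrac{\partial^i p_j}{\partial\theta^i} \prod_{u=1}^{j} dx_u$, $(i = 1, \cdots, m)$.

(2.3.8) is valid on account of the type of sequential process (§2.1). Now

(2.3.10) $E(\varphi_i(n)) = \sum_{j=1}^{\infty} \int_{R_j} \dfrac{\partial^i p_j}{\partial\theta^i} \prod_{u=1}^{j} dx_u = 0$, $(i = 1, \cdots, m)$.

By (2.3.7) and (2.3.10), (2.3.2) reduces to

(2.3.11) $F = \sigma^2(\theta^*(x_1, x_2, \cdots, x_n)) - 2 \sum_{i=1}^{m} K_i \dfrac{d^i \gamma(\theta)}{d\theta^i} + \sum_{i,j=1}^{m} K_i K_j \lambda_{ij}$.

Now $\| \lambda_{ij} \|$ being non-singular on account of condition (2.2.6), we get just one set of values of $K$'s which minimize $F$. These values are given by

(2.3.12) $K_i = \sum_{j=1}^{m} {}_{(m)}\lambda^{ij} \dfrac{d^j \gamma(\theta)}{d\theta^j}$,

where

(2.3.13) $\| {}_{(m)}\lambda^{ij} \| = \| \lambda_{ij} \|^{-1}$, $(i, j = 1, 2, \cdots, m)$.

Putting the above values of $K_i$ $(i = 1, 2, \cdots, m)$ in (2.3.11), we obtain

(2.3.14) $F = \sigma^2(\theta^*(x_1, x_2, \cdots, x_n)) - \sum_{i,j=1}^{m} {}_{(m)}\lambda^{ij} \dfrac{d^i \gamma}{d\theta^i} \dfrac{d^j \gamma}{d\theta^j}$.

Hence, $F$ being non-negative by (2.3.1), we have

(2.3.15) $\sigma^2(\theta^*(x_1, x_2, \cdots, x_n)) \ge \sum_{i,j=1}^{m} {}_{(m)}\lambda^{ij} \dfrac{d^i \gamma}{d\theta^i} \dfrac{d^j \gamma}{d\theta^j}$.

Thus the right hand side of the above inequality gives the lower bound to the variance of unbiased estimates of $\gamma(\theta)$.² When $\gamma(\theta) = \theta$, the above reduces to

(2.3.16) $\sigma^2(\theta^*(x_1, x_2, \cdots, x_n)) \ge {}_{(m)}\lambda^{11}$.

When $m = 1$ and $p_n(x_1, x_2, \cdots, x_n; \theta) = \prod_{i=1}^{n} f(x_i; \theta)$, (2.3.16) reduces to

(2.3.17) $\sigma^2(\theta^*(x_1, x_2, \cdots, x_n)) \ge \dfrac{1}{En \cdot E\left( \dfrac{\partial \log f}{\partial\theta} \right)^2}$,

which is the result given by Wolfowitz [5].

When $n$, the chance variable, is constant and equal to $N$, then (2.3.15) and (2.3.16) correspond to those given by Bhattacharyya [3]. Although the conditions of regularity under which Bhattacharyya proves his results are not clear from his paper, they are likely to be slightly different from those in §2.3, as the results in [3] are obtained only for a fixed size sample.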
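
The bound in (2.3.15) is a quadratic form in the derivatives of $\gamma(\theta)$, and a small computation may make this concrete. The following Python sketch is an illustration added here, not taken from the paper: the function names are hypothetical, and the test case (fixed $N$ observations from a normal population with unit variance, $\gamma(\theta) = \theta^2$, $m = 2$) is an assumption chosen because its moments are easy to work out by hand. With $S = \sum (x_i - \theta)$ one has $\varphi_1 = S$ and $\varphi_2 = S^2 - N$, so that $\lambda_{11} = N$, $\lambda_{12} = 0$ and $\lambda_{22} = 2N^2$.

```python
from fractions import Fraction
from typing import List, Sequence


def solve(matrix: Sequence[Sequence[Fraction]],
          rhs: Sequence[Fraction]) -> List[Fraction]:
    """Solve matrix * k = rhs exactly by Gauss-Jordan elimination.

    Assumes the matrix is non-singular, as in condition (2.2.6).
    """
    size = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[r][size] for r in range(size)]


def bhattacharyya_bound(lam: Sequence[Sequence[Fraction]],
                        gamma_derivs: Sequence[Fraction]) -> Fraction:
    """Right hand side of (2.3.15): sum_ij lambda^{ij} gamma^(i) gamma^(j)."""
    k = solve(lam, gamma_derivs)            # the K_i of (2.3.12)
    return sum((ki * gi for ki, gi in zip(k, gamma_derivs)), Fraction(0))


# Assumed illustration: N = 10, theta = 1, gamma(theta) = theta**2.
N, theta = 10, Fraction(1)
lam = [[Fraction(N), Fraction(0)],
       [Fraction(0), Fraction(2 * N * N)]]
derivs = [2 * theta, Fraction(2)]           # gamma'(theta), gamma''(theta)

bound_m1 = bhattacharyya_bound([lam[0][:1]], derivs[:1])   # uses phi_1 only
bound_m2 = bhattacharyya_bound(lam, derivs)                # uses phi_1, phi_2
```

In this assumed case the bound rises from $4\theta^2/N$ for $m = 1$ to $4\theta^2/N + 2/N^2$ for $m = 2$, which is the variance of $\bar{x}^2 - 1/N$, an unbiased estimate of $\theta^2$.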

² This lower bound for the variance of unbiased estimates has been obtained along the same lines in an unpublished paper by A. Wald. Independently, for fixed size samples, C. Stein has obtained the same result in a paper not yet published.

§2.4. We now obtain the necessary and sufficient conditions under which the lower bound given in (2.3.16) is actually higher than that given in (2.3.17).

We can easily see that

(2.4.1) ${}_{(m)}\lambda^{11} = \dfrac{1}{\lambda_{11}(1 - R^2_{1 \cdot 23 \cdots m})}$,

where $R_{1 \cdot 23 \cdots m}$ is the multiple correlation coefficient between $\varphi_1(n)$ and $\varphi_2(n), \cdots, \varphi_m(n)$. The increase in the lower bound is therefore

(2.4.2) $\dfrac{1}{\lambda_{11}(1 - R^2_{1 \cdot 23 \cdots m})} - \dfrac{1}{\lambda_{11}}$,

which is further equal to

(2.4.3) $\dfrac{R^2_{1 \cdot 23 \cdots m}}{\lambda_{11}(1 - R^2_{1 \cdot 23 \cdots m})}$.

Thus the lower bound for the variance of unbiased estimates of $\theta$ obtained by using $m > 1$ is higher than that obtained by employing $m = 1$ if and only if $R_{1 \cdot 23 \cdots m}$ is not zero for some $m \ge 2$. This is equivalent to the condition that for at least one $i \ge 2$, $\rho_{1i}$, the correlation coefficient between $\varphi_1(n)$ and $\varphi_i(n)$ $(i > 1)$, is different from zero. Suppose further that we have used $m = a$ and that we wish to find the increase in the lower bound if $a$ were replaced by $a + 1$. The increase in this case is given by

(2.4.4) $\dfrac{\rho^2_{1(a+1) \cdot 23 \cdots a}}{\lambda_{11}(1 - R^2_{1 \cdot 23 \cdots (a+1)})}$,

where $\rho_{1(a+1) \cdot 23 \cdots a}$ is the partial correlation coefficient between $\varphi_1(n)$ and $\varphi_{a+1}(n)$ keeping $\varphi_2(n), \cdots, \varphi_a(n)$ fixed. It is greater than zero if and only if $\rho_{1(a+1) \cdot 23 \cdots a}$ is not equal to zero.
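
The identities (2.4.1) to (2.4.3) can be checked by exact arithmetic in the simplest case $m = 2$, where the multiple correlation reduces to the ordinary correlation between $\varphi_1(n)$ and $\varphi_2(n)$. The moment values in the Python sketch below are hypothetical numbers chosen only for the check; they do not come from the paper.

```python
from fractions import Fraction

# Hypothetical second moments lambda_11, lambda_12, lambda_22 (m = 2).
lam11, lam12, lam22 = Fraction(2), Fraction(1), Fraction(3)

det = lam11 * lam22 - lam12 ** 2
upper_11 = lam22 / det                       # element (1,1) of the inverse matrix
r_squared = lam12 ** 2 / (lam11 * lam22)     # squared correlation of phi_1, phi_2

via_241 = 1 / (lam11 * (1 - r_squared))      # right hand side of (2.4.1)
increase = via_241 - 1 / lam11               # (2.4.2): gain over the m = 1 bound
via_243 = r_squared / (lam11 * (1 - r_squared))   # (2.4.3)
```

With these numbers the $m = 2$ bound is $3/5$ against $1/2$ for $m = 1$, and both expressions for the increase give $1/10$.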

§2.5. If $p_n(x_1, x_2, \cdots, x_n; \theta_1)$ also depends upon a finite number of other parameters $\theta_2, \theta_3, \cdots, \theta_T$, then a lower bound higher than or equal to that given in (2.3.16) can be obtained by using

$\sum_{i_1 + i_2 + \cdots + i_T \le m} K_{i_1 i_2 \cdots i_T} \varphi_{i_1 i_2 \cdots i_T}(n)$ instead of $\sum_{i=1}^{m} K_i \varphi_i(n)$ in (2.3.1).

The lower bound in this case is given by (3.1.14) (see section 3) by taking $s = 1$, that is,

(2.5.1) $\sigma^2(\theta_1^*(x_1, x_2, \cdots, x_n)) \ge C(1, 1)$,

where $C(1, 1)$ is the element in the first row and first column of the inverse of $W$ defined in (3.1.9).

The result for $n = N$, $N$ fixed, is obtained by Bhattacharyya [3, 1947]. Let us illustrate it by an example. Take samples of fixed size $N$. Suppose we are required to find the lower bound to the variance of unbiased estimates of $\theta_1$ in the normal population

(2.5.2) $f(x; \theta_1, \theta_2) = \dfrac{1}{\sqrt{2\pi\theta_1}} \exp\left( -\dfrac{(x - \theta_2)^2}{2\theta_1} \right)$

on the basis of $N$ independent observations $x_1, x_2, \cdots, x_N$. The lower bound for the variance of the unbiased estimates of $\theta_1$, when we use $\sum_{i=1}^{m} K_i \varphi_i(N)$ in (2.3.1), is given by $\dfrac{2\theta_1^2}{N}$.

However, if $\sum_{i_1 + i_2 \le 2} K_{i_1 i_2} \varphi_{i_1 i_2}(N)$ is used, the lower bound, by the help of (2.5.1), is found to be equal to $2\theta_1^2/(N - 1)$. In fact there exists the statistic

$\dfrac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}$

whose variance is equal to $2\theta_1^2/(N - 1)$, where $\bar{x} = \sum_{i=1}^{N} x_i / N$. Thus the use of $\sum K_{i_1 i_2} \varphi_{i_1 i_2}(N)$ brings into relief the unbiased estimate with the least variance.
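
The statement that the usual sample variance attains $2\theta_1^2/(N - 1)$ can be checked by simulation. The Python sketch below is a rough Monte Carlo check added for illustration, under assumed values $\theta_1 = 1$, $\theta_2 = 0$ and $N = 5$; the function name, the seed and the number of trials are arbitrary choices, and the agreement is only approximate.

```python
import random
import statistics


def simulate_s2_variance(theta1: float, theta2: float,
                         n_obs: int, trials: int, seed: int = 0) -> float:
    """Empirical variance of the unbiased sample variance over many samples.

    theta1 is the population variance and theta2 the population mean.
    """
    rng = random.Random(seed)
    sd = theta1 ** 0.5
    s2_values = [
        statistics.variance([rng.gauss(theta2, sd) for _ in range(n_obs)])
        for _ in range(trials)
    ]
    return statistics.pvariance(s2_values)


empirical = simulate_s2_variance(theta1=1.0, theta2=0.0, n_obs=5, trials=20000)
bound = 2 * 1.0 ** 2 / (5 - 1)      # 2 * theta1**2 / (N - 1)
weaker = 2 * 1.0 ** 2 / 5           # 2 * theta1**2 / N, the bound from theta1 alone
```

The simulated value should fall near `bound` and visibly above `weaker`, in line with the remark that the second bound is the one actually attained.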


3. Multi-parameter case. In this section we will prove the result mentioned in (1.3.2) of §1.3.

§3.1. Let $\theta$ consist of $T$ components $(\theta_1, \theta_2, \cdots, \theta_T)$ and $\theta_1^*, \theta_2^*, \cdots, \theta_T^*$ be unbiased estimates of $\theta_1, \theta_2, \cdots, \theta_T$ respectively. Also, let a sequential process of the type described in §2.1 be given. We postulate the following regularity conditions:

(3.1.1). The covariance matrix $\| V_{ij} \|$ of the estimates $\theta_i^*$ $(i = 1, 2, \cdots, T)$ is non-singular in $D$, where $D$ is an open interval of the $T$-dimensional parameter space.

(3.1.2). The conditions of section (2.2) are satisfied for each one of $\theta_i^*$ $(i = 1, 2, \cdots, T)$ and $\varphi_{i_1 i_2 \cdots i_T}(n)$, $(i_1 + i_2 + \cdots + i_T \le m)$.

(3.1.3). The covariance matrix of $\varphi_{i_1 i_2 \cdots i_T}(n)$, $i_1 + i_2 + \cdots + i_T \le m$, exists and is non-singular.

Under the assumptions (3.1.1)-(3.1.3), we prove the result (1.3.2) in section 1.3.

Proof: Using the same arguments of §2.3, we obtain

(3.1.4) $E(\theta_j^*(x_1, x_2, \cdots, x_n)\, \varphi_{i_1 i_2 \cdots i_T}(n))$

(3.1.5) $\quad = 1$ if $i_j = 1$, $i_k = 0$ $(k \ne j;\ j, k = 1, 2, \cdots, T)$,
$\quad = 0$ otherwise.

Let the covariance matrix of $\theta_i^*$ $(i = 1, 2, \cdots, s;\ s \le T)$ and $\varphi_{i_1 i_2 \cdots i_T}(n)$, $(i_1 + i_2 + \cdots + i_T \le m)$ be given by

(3.1.6) $U = \begin{pmatrix} A & B \\ B' & W \end{pmatrix}$,

where

(3.1.7) $A = \| V_{ij} \|$, $i, j = 1, 2, \cdots, s$; $s \le T$;

(3.1.8) $B = \| I, 0 \|$;

(3.1.9) and $W$ = covariance matrix of the set $[\varphi_{i_1 i_2 \cdots i_T}(n);\ i_1 + i_2 + \cdots + i_T \le m]$,

arranged such that the $j$th term in the leading diagonal is given by

(3.1.10) $E(\varphi^2_{i_1 i_2 \cdots i_T}(n))$, where $i_j = 1$, $i_q = 0$, $q \ne j$, $(j = 1, 2, \cdots, T)$,

and $B'$ is the transpose of $B$.

As $U$ is positive semi-definite, we have

(3.1.11) $|U| \ge 0$.

The above can further be reduced to

(3.1.12) $|W| \cdot |A - B W^{-1} B'| \ge 0$,

which leads to

(3.1.13) $|A - B \cdot W^{-1} \cdot B'| \ge 0$, as $W$ is positive definite.

By the use of (3.1.8) we obtain from above

(3.1.14) $|A - C| \ge 0$,

where $C$ is the top left part of $W^{-1}$ consisting of $s$ rows and $s$ columns.
Let us now consider the matrix

(3.1.15) $\| V_{ij} - w^{ij} \|$, $(i, j = 1, 2, \cdots, T)$,

where $\| w^{ij} \|$ is the top left part of $W^{-1}$ consisting of $T$ rows and $T$ columns, and is equal to

(3.1.16) $\| W_{11} - W_{12} W_{22}^{-1} W_{21} \|^{-1}$,

when $W$ is written as

(3.1.17) $W = \begin{pmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{pmatrix}$,

where $W_{11}$ has $T$ rows and $T$ columns.

By the repeated application of (3.1.14), we are led to the conclusion that all the leading minors of the matrix in (3.1.15) are either positive or zero. Hence the matrix in (3.1.15) is semi-positive definite.

If now we put

(3.1.18) $\| \mu_{ij} \| = \| w^{ij} \|^{-1}$,

we obtain

(3.1.19) $\| \mu_{ij} - V^{ij} \|$

to be semi-positive definite. Thus the ellipsoid

(3.1.20) $\sum_{i,j=1}^{T} V^{ij} u_i u_j = T + 2$

contains within itself the ellipsoid

(3.1.21) $\sum_{i,j=1}^{T} \mu_{ij} u_i u_j = T + 2$.

Cramér calls the ellipsoid in (3.1.20) a "concentration" ellipsoid.

We will now show that the ellipsoid given by (3.1.21) contains within itself the ellipsoid

(3.1.22) $\sum_{i,j=1}^{T} L_{ij} u_i u_j = T + 2$,

where $\| L_{ij} \|$ is the information matrix given by $W_{11}$ in (3.1.17). We will prove the above by showing

(3.1.23) $\| L_{ij} - \mu_{ij} \|$, $(i, j = 1, 2, \cdots, T)$,

to be semi-positive definite.

We obtain, from (3.1.16) and (3.1.18),

(3.1.24) $\| \mu_{ij} \| = W_{11} - W_{12} W_{22}^{-1} W_{21}$, $(i, j = 1, 2, \cdots, T)$.

From the above it follows that

(3.1.25) $\| L_{ij} - \mu_{ij} \| = W_{12} W_{22}^{-1} W_{21}$.

Thus the matrix on the right hand side is semi-positive definite since $W_{22}$ is positive definite, and we see that the ellipsoid (3.1.21) contains within itself the ellipsoid given by (3.1.22). This proves the assertion made in (1.3.2) of §1.3. It may be seen that (3.1.22) is strictly contained in (3.1.21) if and only if $W_{12} \ne 0$. It may be mentioned that in this section as well as elsewhere, $T + 2$, appearing on the right hand side of the equation of an ellipsoid, can be replaced by any positive constant. Also the ellipsoid in (3.1.21) depends upon the choice of $m$ and it can be shown that for any two positive integers $m_1, m_2$ $(m_2 > m_1)$ the ellipsoid for $m = m_2$ contains within itself the one for $m = m_1$.

§3.2. In general, let $\theta_i^*(x_1, x_2, \cdots, x_n)$ be statistics whose expectations are $\gamma_i(\theta_1, \theta_2, \cdots, \theta_T)$, $(i = 1, 2, \cdots, T)$, the latter being assumed to admit partial derivatives of all possible orders. Under the postulates enumerated in §3.1, we see that the ellipsoid in (3.1.20) contains within itself the ellipsoid

(3.2.1) $\sum_{i,j=1}^{T} \delta_{ij} u_i u_j = T + 2$,

where

(3.2.2) $\| \delta_{ij} \| = \| R\, W^{-1} R' \|^{-1}$, $i, j = 1, 2, \cdots, T$,

and

(3.2.3) $R = \left\| \dfrac{\partial^{i_1 + i_2 + \cdots + i_T} \gamma_j}{\partial\theta_1^{i_1} \partial\theta_2^{i_2} \cdots \partial\theta_T^{i_T}} \right\|$, $(j = 1, 2, \cdots, T;\ i_1 + i_2 + \cdots + i_T \le m)$,

where $j$ and $i_1 + i_2 + \cdots + i_T$ indicate the number of the row and the column respectively and $R$ is arranged to correspond to the arrangement of $W$, where $W$ is the same as given in (3.1.9).


4. Achievement of the different lower bounds. In §4.1 we will demonstrate the desirability of finding a higher lower bound to the variance of sequential estimates than that given by Wolfowitz, by giving two examples in which the latter is not achieved. From §2.4 it is clear that this will be so if $E(\varphi_1(n) \cdot \varphi_i(n))$ is not zero for at least one value of $i \ge 2$. We will demonstrate that this is true for $i = 2$. In §4.2 we show that if an "efficient" statistic exists for all $M > N$, the bound is achieved only in the case when the sample size is fixed. In §4.3 we obtain necessary and sufficient conditions for the attainment of the bound given in (1.1.4). In §4.4 we discuss the conditions under which there exists a "concentration ellipsoid" which coincides with the ellipsoid given in (3.1.21) for samples of fixed size $N$.

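§4.1. Ex. 1. The Wald sequential procedure for testing $\theta = \theta_1$ against $\theta = \theta_2$ in a normal population

(4.1.1) $f(x; \theta) = \dfrac{1}{\sqrt{2\pi}} \exp\left( -\tfrac{1}{2}(x - \theta)^2 \right)$

is given as follows: If

(4.1.2) $B < \sum_{i=1}^{s} \left( x_i - \dfrac{\theta_1 + \theta_2}{2} \right) < A$, $(s = 1, 2, \cdots, j - 1)$,

and

(4.1.3) $\sum_{i=1}^{j} \left( x_i - \dfrac{\theta_1 + \theta_2}{2} \right)$ is either $\ge A$ or $\le B$,

we cease sampling and make a decision. Here $A$ and $B$ are constants fixed by the probability levels of making a correct decision.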

Let us denote the set of points satisfying (4.1 2) and (.1.1,3) by Rj . In this 
case 

(4.1.4) 0i(n) = Z) (ai. — 5) = — nO, where /?„ = Z Xi- 

1-1 ,-j 

The above is differentiable ivith respect to 0. On differentiating we have 

(4.1.5) <^ 2 (a) = {Z. - nef - n. 

Now 

(4 1.6) E(0i(n) . 0.(71)) = E{Z„ ~ ndf - E{n[Z, - nO)). 

By theorem 7,3, Wolfowitz [5], 

(4.1.7) E(Z,, - ndf = En - E(X - df + 3E{n{Z„ - h0)), 

where -Y has the distribution given in (4.1.1), As E{X - Bf is equal to zero, 

(4.1.6) reduces to 

(4 1.8) E(.Un) • 0a(7i)) = 2E(ii(2„ - n0)). 

We will now show that right hand side of (4.1,8) is not identically zero in 9 , 
Let us consider 

(4.1.9) EM . 11 [exp (-J i; (X, - .)')] ff ix.. 





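Differentiating with respect to $\theta$, we get

(4.1.10) $\frac{d}{d\theta} E(n) = \sum_{j=1}^{\infty} j \int_{R_j} \left[\sum_{i=1}^{j} (x_i - \theta)\right] (2\pi)^{-j/2} \exp\left[-\tfrac{1}{2} \sum_{i=1}^{j} (x_i - \theta)^2\right] \prod_{i=1}^{j} dx_i.$

The right hand side of the above equation being equal to $E(n(Z_n - n\theta))$, the latter
does not vanish identically in $\theta$, because the left hand side is not identically
zero. The step from (4.1.9) to (4.1.10) can be easily seen to be valid.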
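
The passage from (4.1.4) to (4.1.5) can be spelled out as a short worked step. The sketch below assumes the definition $\phi_i(n) = \frac{1}{p_n}\frac{\partial^i p_n}{\partial\theta^i}$ from the earlier sections (restated here as an assumption, since it is not repeated in this section); it is the same identity that is used later in (5.1.3).

```latex
\begin{align*}
\frac{\partial \phi_1}{\partial \theta}
  &= \frac{\partial}{\partial \theta}\left(\frac{1}{p_n}\frac{\partial p_n}{\partial \theta}\right)
   = \frac{1}{p_n}\frac{\partial^2 p_n}{\partial \theta^2}
     - \left(\frac{1}{p_n}\frac{\partial p_n}{\partial \theta}\right)^2
   = \phi_2 - \phi_1^2, \\
\frac{\partial \phi_1}{\partial \theta}
  &= \frac{\partial}{\partial \theta}\,(Z_n - n\theta) = -n, \\
\phi_2(n) &= \phi_1^2 - n = (Z_n - n\theta)^2 - n .
\end{align*}
```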

Ex. 2. The Wald sequential procedure for testing $p = p_1$ against $p = p_2$ in a
binomial distribution, where $p$ is the probability of the event occurring, is given
as follows: If

(4.1.11) $B < \sum_{i=1}^{s} (x_i - d) < A, \quad s = 1, 2, \cdots, j - 1,$

and

(4.1.12) $\sum_{i=1}^{j} (x_i - d)$ is either $\geq A$ or $\leq B$,

where $d$ is given by $\log[(1 - p_1)/(1 - p_2)] / \log[p_2(1 - p_1)/p_1(1 - p_2)]$, the
process stops with the $j$th observation and a decision is taken. Here, $x_i$ is the
characteristic function of the event at the $i$th trial, that is:

$x_i = 1$, when the event occurs at the $i$th trial;

$x_i = 0$, otherwise.

Let us denote the set of points satisfying (4.1.11) and (4.1.12) by $R_j$. In this
case we find

(4.1.13) $E[\phi_1(n) \cdot \phi_2(n)] = \frac{2}{p^2(1 - p)^2}\, E(n(Z_n - np)),$

where $Z_n = \sum_{i=1}^{n} x_i$. We have now to show that the right hand side is not
identically zero. Differentiating

(4.1.14) $E(n) = \sum_{j=1}^{\infty} j \sum_{R_j} p^{\sum x_i}(1 - p)^{j - \sum x_i}$

with regard to $p$, we obtain

(4.1.15) $\frac{d}{dp} E(n) = \sum_{j=1}^{\infty} j \sum_{R_j} \frac{\sum x_i - jp}{p(1 - p)}\, p^{\sum x_i}(1 - p)^{j - \sum x_i}.$

The right hand side of the above is the same as

(4.1.16) $\frac{1}{p(1 - p)}\, E(n(Z_n - np)).$

Thus the left hand side of (4.1.15) being not identically zero, the same is true for
(4.1.16), and consequently the bound given by Wolfowitz is not attained.





The step from (4.1.14) to (4.1.15) is valid as

(4.1.17) 

is absolutely and uniformly convergent. 
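
The identity behind (4.1.15) and (4.1.16), namely $\frac{d}{dp}E(n) = E(n(Z_n - np))/p(1-p)$, can be checked numerically. The following is a minimal sketch, not part of the original paper: it computes $E(n)$ and $E(n(Z_n - np))$ exactly for a Bernoulli sequential test by stepping through the reachable states. The function name `sprt_moments`, the truncation at `max_steps` trials, and the numerical values of $p_1$, $p_2$, $A$, $B$ are all assumptions made for illustration.

```python
from math import log

def sprt_moments(p, p1, p2, A, B, max_steps=200):
    """Return (E(n), E(n (Z_n - n p))) for a truncated Bernoulli sequential test.

    Sampling stops at the first j with sum_{i<=j}(x_i - d) >= A or <= B,
    or at j = max_steps (a truncation added here so that all sums are finite).
    """
    d = log((1 - p1) / (1 - p2)) / log(p2 * (1 - p1) / (p1 * (1 - p2)))
    continuing = {0: 1.0}   # number of successes -> probability of arriving unstopped
    e_n = 0.0
    e_nz = 0.0
    for j in range(1, max_steps + 1):
        reached = {}
        for k, prob in continuing.items():
            reached[k] = reached.get(k, 0.0) + prob * (1 - p)
            reached[k + 1] = reached.get(k + 1, 0.0) + prob * p
        continuing = {}
        for k, prob in reached.items():
            s = k - j * d                      # sum of (x_i - d) after j trials
            if s >= A or s <= B or j == max_steps:
                e_n += j * prob
                e_nz += j * (k - j * p) * prob
            else:
                continuing[k] = prob
    return e_n, e_nz

# Hypothetical design values, chosen only for illustration.
p1, p2, A, B = 0.3, 0.6, 2.0, -2.0
p, h = 0.3, 1e-5

# Central difference approximation to dE(n)/dp.
derivative = (sprt_moments(p + h, p1, p2, A, B)[0]
              - sprt_moments(p - h, p1, p2, A, B)[0]) / (2 * h)
expected_n, e_nz = sprt_moments(p, p1, p2, A, B)
right_side = e_nz / (p * (1 - p))
print(expected_n, derivative, right_side)
```

Because the stopping regions do not depend on $p$, the identity holds term by term for the truncated rule as well, so the two printed derivatives should agree up to the error of the finite difference.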

§4.2. Let $\theta^*$ be some unbiased estimate of $\theta$, where the $x_i$'s are successive
independent observations on the chance variable $X$ having the probability density
function or probability function $f(x; \theta)$. We adopt a sequential procedure
mentioned in §2.1 satisfying the regularity conditions in §2.2 and also postulate
the following:

(i) For all positive integral values of $M \geq N$,

$p_M(x_1, x_2, \cdots, x_M; \theta) = \prod_{i=1}^{M} f(x_i; \theta)$

possesses an 'efficient' estimate for $\theta$, where $N$ is the least value of $n$ for which
$\Pr(n = N) \neq 0$.

(ii) $E(n)$ exists and admits derivatives up to the second order with respect
to $\theta$. Furthermore, $\frac{d}{d\theta}(E(n))$ is either zero for all $\theta$ under consideration or is
never zero.

Under the above conditions the Wolfowitz lower bound for the variance of unbiased
estimates is achieved only when $\Pr(n = N) = 1$.

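Proof: This bound will be attained if and only if there exists an unbiased
estimate $\theta^*$ of $\theta$ such that

(4.2.1) $E(\theta^* - \theta - K\phi_1(n))^2 = 0,$

that is,

(4.2.2) $\theta^* - \theta = K\phi_1(n)$

with probability one, where $K$ is independent of all $x_i$'s and $n$. As there exists
an 'efficient' estimate, say $\psi(M)$, for all $M \geq N$, we have

(4.2.3) $\psi(M) - \theta = \frac{\phi_1(M)}{M \cdot E\left[\left(\frac{\partial}{\partial\theta} \log f(x; \theta)\right)^2\right]}$

for all $M \geq N$. From (4.2.2) and (4.2.3), it follows that

(4.2.4) $\theta^* - \theta = K \cdot n \cdot (\psi(n) - \theta) \cdot E\left[\left(\frac{\partial}{\partial\theta} \log f(x; \theta)\right)^2\right].$

Now as

(4.2.5) $K = \frac{1}{En \cdot E\left[\left(\frac{\partial}{\partial\theta} \log f(x; \theta)\right)^2\right]},$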





we have

(4.2.6) $\theta^* - \theta = \frac{n \cdot (\psi(n) - \theta)}{En}.$

If $E(n)$ is independent of $\theta$, then from (4.2.6), on differentiating with regard
to $\theta$, we obtain

(4.2.7) $n/E(n) = 1,$

that is, $n$ is constant with probability one and the sequential procedure reduces
to a fixed size sample case. If $E(n)$ is not independent of $\theta$, then differentiating
(4.2.6) with regard to $\theta$, we obtain

(4.2.8) $n \cdot (\psi(n) - \theta) \cdot \frac{\frac{d}{d\theta}(En)}{(En)^2} = 1 - \frac{n}{En}.$

As $\frac{d}{d\theta}(En)$ is not equal to zero for any $\theta$ under consideration, substituting the
value of $\psi(n)$ from (4.2.8) in (4.2.6), the latter takes the form:

(4.2.9) $\theta^* - \theta = \frac{En - n}{\frac{d}{d\theta}(En)}.$

Differentiating the above with respect to $\theta$, the result is:

(4.2.10) $-1 = -\frac{En - n}{\left[\frac{d}{d\theta}(En)\right]^2} \cdot \frac{d^2}{d\theta^2}(En) + 1.$

Now if $\frac{d^2}{d\theta^2}(En) = 0$, then (4.2.10) is not valid, thereby contradicting (4.2.2).
Otherwise, rearranging (4.2.10), we obtain

(4.2.11) $n = -2\,\frac{\left[\frac{d}{d\theta}(En)\right]^2}{\frac{d^2}{d\theta^2}(En)} + En,$

that is, $n$ is a constant with probability one. This proves that the Wolfowitz bound
is achieved only when $n$ is a constant with probability one. This generalizes
the result of Blackwell and Girshick [6] to the extent that in [6] the existence of
an efficient estimate is assumed for all integral values of $M$ instead of $M \geq N$.

The above proof, with some modifications, is
also valid when the successive observations are not independent.

[...] sufficient statistic for all $M \geq N$, when we restrict ourselves to probability
density functions satisfying the conditions given by Koopman in [7].
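
The contrast between the sequential and the fixed size case can be illustrated on a very small example, worked in exact rational arithmetic. This is a sketch under assumptions that are not in the original text: observations are Bernoulli with parameter $p$, and the (hypothetical) sequential rule stops after the first trial if it is a success and otherwise after the second. Matching coefficients of powers of $p$ shows that $x_1$ is the only unbiased estimate of $p$ under this rule, and its variance $p(1-p)$ exceeds the Wolfowitz bound $p(1-p)/E(n)$; for a fixed sample size the mean attains the bound.

```python
from fractions import Fraction
from itertools import product

def sequential_example(p):
    """Rule (illustrative): observe x1; stop if x1 = 1, otherwise observe x2 and stop."""
    p = Fraction(p)
    q = 1 - p
    # Sample points of the sequential experiment with their probabilities.
    points = {(1,): p, (0, 1): q * p, (0, 0): q * q}
    expected_n = sum(len(x) * pr for x, pr in points.items())
    mean = sum(x[0] * pr for x, pr in points.items())          # estimate is x1
    var = sum((x[0] - p) ** 2 * pr for x, pr in points.items())
    info_per_obs = 1 / (p * q)        # E[(d log f / dp)^2] for a single trial
    wolfowitz_bound = 1 / (expected_n * info_per_obs)
    return mean, var, wolfowitz_bound, expected_n

def fixed_size_example(p, N):
    """Variance of the sample mean and the bound p(1 - p)/N for fixed N."""
    p = Fraction(p)
    q = 1 - p
    var = Fraction(0)
    for x in product((0, 1), repeat=N):
        k = sum(x)
        var += (Fraction(k, N) - p) ** 2 * p ** k * q ** (N - k)
    return var, p * q / N

mean, var, bound, expected_n = sequential_example(Fraction(1, 3))
print(mean, var, bound, expected_n)           # variance is strictly above the bound
print(fixed_size_example(Fraction(1, 3), 2))  # variance equals the bound
```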





§4.3. Let us consider a sample of fixed size $N$. Let $\theta^*$ together with the
probability density function $p_N$ satisfy the following regularity conditions:

(i). There exists a transformation $T$ from $(x_1, x_2, \cdots, x_N)$ to the variables
$(\xi_1, \xi_2, \cdots, \xi_{N-1}, \theta^*)$,

(4.3.1) $\xi_i = \xi_i(x_1, x_2, \cdots, x_N), \quad i = 1, 2, \cdots, N - 1; \qquad \theta^* = \theta^*(x_1, x_2, \cdots, x_N),$

such that

(a). The functions are everywhere unique and continuous, and have
continuous partial derivatives

$\frac{\partial \xi_i}{\partial x_u}, \ \frac{\partial \theta^*}{\partial x_u} \quad (i = 1, 2, \cdots, N - 1; \ u = 1, 2, \cdots, N)$

in all points $(x_1, x_2, \cdots, x_N)$ except possibly in certain points belonging
to a finite number of hyper-surfaces.

(b). The relations (4.3.1) define a one-to-one correspondence between the
points $x = (x_1, x_2, \cdots, x_N)$ and $\eta = (\xi_1, \xi_2, \cdots, \xi_{N-1}, \theta^*)$ so that
conversely $x_i = \eta_i(\xi_1, \xi_2, \cdots, \xi_{N-1}, \theta^*)$ where the $\eta_i$ are unique.

(ii). There exist partial derivatives of $g(\theta^*; \theta)$, $h(\xi_1, \xi_2, \cdots, \xi_{N-1} \mid \theta^*; \theta)$ with
regard to $\theta$ of all orders up to and including $m$, where $m$ is some finite integer.
The variances of $\theta^*$, $h_i$ and $g_i \cdot g_j$, $i, j = 1, 2, \cdots, m$, are finite, where $h_i$ and $g_i$
are defined in section 1.

(iii). There exist functions $T_{vi}$ $(v = 1, 2, \cdots, m; \ i = 1, 2, 3)$ such that

$\left|\frac{\partial^v p_N}{\partial \theta^v}\right| < T_{v1}(x_1, x_2, \cdots, x_N); \quad \left|\frac{\partial^v g}{\partial \theta^v}\right| < T_{v2}(\theta^*); \quad \left|\frac{\partial^v h}{\partial \theta^v}\right| < T_{v3}(\xi_1, \xi_2, \cdots, \xi_{N-1}; \theta^*),$

for all $\theta$ in $D$ and for almost all $(x_1, x_2, \cdots, x_N)$, where $D$ is an open interval.
Further

$\int T_{v1}(x_1, x_2, \cdots, x_N) \prod_{u=1}^{N} dx_u, \quad \int T_{v2}(\theta^*)\, d\theta^* \quad$ and $\quad \int T_{v3}(\xi_1, \xi_2, \cdots, \xi_{N-1}; \theta^*) \prod_{u=1}^{N-1} d\xi_u$

are all finite, the range of integration, in each case, being the whole range for
the arguments indicated. Then the necessary and sufficient conditions that the
variance of $\theta^*$ equals the lower bound given in (1.1.4) are





(iv). $h_1, h_2, \cdots, h_m$ are linearly dependent considered as functions of
$\xi_1, \xi_2, \cdots, \xi_{N-1}$ for any given $\theta^*$ and $\theta$, and

(v). The probability density function $g$ of $\theta^*$ is of the form

$\theta^* - \theta = \sum_{i=1}^{m} K_i \cdot g_i,$

where $K_i$ may depend upon $\theta$ and $N$ only.

The proof here is given when $p_N$ is a probability density function. It is also
valid with slight modification when $p_N$ is the probability of discrete variables.

Proof: Let $J$ be the Jacobian of the transformation $T$ in (4.3.1). Then
because of conditions (i) and (ii) above, we have,

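(4.3.2) $p_N(x_1, x_2, \cdots, x_N; \theta) \cdot |J| = g(\theta^*; \theta) \cdot h(\xi_1, \xi_2, \cdots, \xi_{N-1} \mid \theta^*; \theta).$

Further

(4.3.3) $\int h(\xi_1, \xi_2, \cdots, \xi_{N-1} \mid \theta^*; \theta) \prod_{u=1}^{N-1} d\xi_u = 1,$

the range of integration being the space of $\xi_1, \xi_2, \cdots, \xi_{N-1}$. Differentiating
the above $i$ times under the integral sign, it follows that

(4.3.4) $E(h_i \mid \theta^*, \theta) = 0.$

Similarly we have

(4.3.5) $E(g_i \cdot h_j) = 0$

as the expectation of the quantity on the L.H.S. is finite by virtue of (ii). More
generally, we have

(4.3.6) $E(F(\theta^*) \cdot h_i) = E\{F(\theta^*) \cdot E[h_i \mid \theta^*]\} = 0$

if $E(F(\theta^*) \cdot h_i)$ is finite. Let us now examine

(4.3.7) $E\left[\theta^* - \theta - \sum_{i=1}^{m} K_i \phi_i(N)\right]^2,$

where, by (4.3.2), $\phi_i(N)$ can also be written as

(4.3.8) $\phi_i(N) = \sum_{r=0}^{i} \binom{i}{r} g_{i-r} h_r, \quad (g_0 = h_0 = 1).$

Now (4.3.7) can be put in the form

(4.3.9) $E\left[\theta^* - \theta - \sum_{i=1}^{m} K_i g_i - \sum_{i=1}^{m} L_i h_i\right]^2,$

where

(4.3.10) $L_i = \sum_{r=i}^{m} \binom{r}{i} K_r g_{r-i}, \quad (i = 1, 2, \cdots, m),$

clearly depend on $\theta$ and $\theta^*$ only.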





By virtue of (4.3.4)-(4.3.6), and $F(\theta^*)$ involved in (4.3.9) being such that
$E[F(\theta^*) \cdot h_i]$ $(i = 1, 2, \cdots, m)$ is finite because of (ii), we can further reduce
(4.3.9) to

(4.3.11) $E\left[\theta^* - \theta - \sum_{i=1}^{m} K_i g_i\right]^2 + E\left[\sum_{i=1}^{m} L_i h_i\right]^2.$

The lower bound will be achieved if and only if the above expression is zero,
the necessary and sufficient conditions for which are:

(4.3.12) $\theta^* - \theta = \sum_{i=1}^{m} K_i \cdot g_i,$

and

(4.3.13) $\sum_{i=1}^{m} L_i h_i = 0$ in $\xi_1, \xi_2, \cdots, \xi_{N-1}$

for any given values of $\theta^*$ and $\theta$.

(4.3.13) is equivalent to the condition that $h_i$, $(i = 1, 2, \cdots, m)$ are linearly
dependent considered as functions of $\xi_1, \xi_2, \cdots, \xi_{N-1}$ for any given values of $\theta$
and $\theta^*$.

When $m$ takes the value one, the above reduces to the Cramér conditions for
the existence of an "efficient" estimate.

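§4.4. Multiparameter case. Let $\theta_1^*, \theta_2^*, \cdots, \theta_T^*$ be the unbiased estimates
of $\theta_1, \theta_2, \cdots, \theta_T$ in the probability density function

$p_N(x_1, x_2, \cdots, x_N; \theta_1, \theta_2, \cdots, \theta_T)$

and let the regularity conditions of §4.3 be satisfied when $\theta^*$ and $\frac{\partial^i}{\partial \theta^i}$ $(i = 1, 2, \cdots, m)$
are replaced by $\theta_i^*$ $(i = 1, 2, \cdots, T)$ and

$\frac{\partial^{\,i_1 + i_2 + \cdots + i_T}}{\partial \theta_1^{i_1} \partial \theta_2^{i_2} \cdots \partial \theta_T^{i_T}}, \quad (i_1 + i_2 + \cdots + i_T \leq m)$

respectively. Further let

(4.4.1) $p_N(x_1, x_2, \cdots, x_N; \theta_1, \theta_2, \cdots, \theta_T) \cdot |J| = g(\theta_1^*, \theta_2^*, \cdots, \theta_T^*; \theta_1, \theta_2, \cdots, \theta_T) \cdot h(\xi_1, \xi_2, \cdots, \xi_{N-T} \mid \theta_1^*, \theta_2^*, \cdots, \theta_T^*; \theta_1, \theta_2, \cdots, \theta_T)$

where $g$ and $h$ are respectively the joint probability distribution function of
$\theta_1^*, \theta_2^*, \cdots, \theta_T^*$ and the conditional probability distribution of $\xi_1, \xi_2, \cdots, \xi_{N-T}$
for a given set of values of $\theta_1^*, \theta_2^*, \cdots, \theta_T^*$. In order that the ellipsoid (3.1.20)
coincides with the one given by (3.1.21), it is necessary and sufficient that the
following be satisfied for each $i$ $(i = 1, 2, \cdots, T)$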

(4.4.2) E ^8, - 0i - ‘ .. . .» 0. 


(4.4.1) 


= Ot, 





Now reasoning similar to that in §4.3, we conclude from the above that the
necessary and sufficient conditions are:

(4.4.3) There exist $T$ independent linear combinations of

$h_{i_1 i_2 \cdots i_T}, \quad (i_1 + i_2 + \cdots + i_T \leq m),$

which vanish with probability one for any given values of the sets

$(\theta_1^*, \theta_2^*, \cdots, \theta_T^*)$ and $(\theta_1, \theta_2, \cdots, \theta_T)$,

and

(4.4.4) $\theta_i^* - \theta_i = \sum_{i_1 + i_2 + \cdots + i_T = 1}^{m} K^{(i)}_{i_1 i_2 \cdots i_T} \, g_{i_1 i_2 \cdots i_T}, \quad (i = 1, 2, \cdots, T),$

where the $K$'s do not depend upon the $\theta^*$'s and $\xi$'s. For $T = 1$, the above reduce
to the conditions in §4.3. We will now give an example in which (4.4.3) and
(4.4.4) are satisfied. Let

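(4.4.5) $p_N(x_1, x_2, \cdots, x_N; \theta_1, \theta_2) = (2\pi\theta_2)^{-N/2} \exp\left[-\frac{1}{2\theta_2} \sum_{i=1}^{N} (x_i - \theta_1)^2\right].$

We have

(4.4.6) $\theta_1^* = \sum_{i=1}^{N} x_i / N = \bar{x},$

(4.4.7) $\theta_2^* = \sum_{i=1}^{N} (x_i - \bar{x})^2 / (N - 1),$

as unbiased estimates of $\theta_1$ and $\theta_2$ in (4.4.5). The joint distribution of $\theta_1^*$ and
$\theta_2^*$ is given by

(4.4.8) $g = \sqrt{\frac{N}{2\pi\theta_2}} \exp\left[-\frac{N(\theta_1^* - \theta_1)^2}{2\theta_2}\right] \cdot \frac{1}{\Gamma\left(\frac{N-1}{2}\right)} \left(\frac{N-1}{2\theta_2}\right)^{\frac{N-1}{2}} (\theta_2^*)^{\frac{N-3}{2}} \exp\left[-\frac{(N-1)\theta_2^*}{2\theta_2}\right],$

from which

(4.4.9) $\theta_1^* - \theta_1 = \frac{\theta_2}{N} \cdot \frac{1}{g} \frac{\partial g}{\partial \theta_1},$

(4.4.10) $\theta_2^* - \theta_2 = \frac{2\theta_2^2}{N - 1} \cdot \frac{1}{g} \frac{\partial g}{\partial \theta_2} - \frac{\theta_2^2}{N(N - 1)} \cdot \frac{1}{g} \frac{\partial^2 g}{\partial \theta_1^2}.$

Thus the ellipsoid (3.1.20) for $\theta_1^*$, $\theta_2^*$ coincides with the ellipsoid (3.1.21)
when $m = 2$. When $m = 1$ the condition (4.4.3) is satisfied but not the one in
(4.4.4), as can be seen from (4.4.9) and (4.4.10). The concentration ellipsoid
then strictly contains within itself the one given by the information matrix.
It may be noted that for $m = 1$, the condition (4.4.3) merely requires that a
system of sufficient statistics exists for estimating $\theta_1, \theta_2, \cdots, \theta_T$. The reason
is that the condition (4.4.3) takes the equivalent form

(4.4.11) $\frac{\partial h}{\partial \theta_i} = 0$

for $i = 1, 2, \cdots, T$ identically in $\xi_1, \xi_2, \cdots, \xi_{N-T}$, that is, that $h$ is free of
$\theta_1, \theta_2, \cdots, \theta_T$.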
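
The statement that for $m = 1$ the concentration ellipsoid strictly contains the one given by the information matrix can be made concrete for the normal example. The sketch below is illustrative only: it uses the standard facts that $\bar{x}$ and the unbiased sample variance are independent with variances $\theta_2/N$ and $2\theta_2^2/(N-1)$, and that the inverse information matrix is diagonal with entries $\theta_2/N$ and $2\theta_2^2/N$. The function name and the chosen numbers are assumptions.

```python
from fractions import Fraction

def covariance_and_information_bound(theta2, N):
    """Covariance matrix of (x-bar, s^2) and inverse information matrix, N(theta1, theta2)."""
    theta2 = Fraction(theta2)
    cov = [[theta2 / N, 0], [0, 2 * theta2 ** 2 / (N - 1)]]
    bound = [[theta2 / N, 0], [0, 2 * theta2 ** 2 / N]]
    return cov, bound

cov, bound = covariance_and_information_bound(1, 10)
# The difference is diagonal with non-negative entries, one of them positive,
# so the covariance matrix exceeds the first-order bound in the s^2 direction only.
excess = [[cov[i][j] - bound[i][j] for j in range(2)] for i in range(2)]
print(excess)
```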


5. Miscellaneous. In §5.1-§5.3 we discuss certain properties of $\phi_i(n)$. In
§5.4 we obtain conditions under which there exists no unbiased estimate of $\theta$,
having a finite variance, which is functionally dependent upon a given unbiased
estimate $\theta^*$ of $\theta$.

§5.1. Assume that there exists an "efficient" statistic $\theta^*(x_1, x_2, \cdots, x_N)$
for estimating $\theta$, in probability density function (or probability)

$p_N(x_1, x_2, \cdots, x_N; \theta).$

That is,

(5.1.1) $\theta^*(x_1, x_2, \cdots, x_N) - \theta = K \phi_1(N),$

where $K$ as usual may only depend on $\theta$. We postulate as usual the existence
of all partial derivatives of $p_N$ of all orders and also of $K$ up to the third order
with

(5.1.2) $\frac{d^3 K}{d\theta^3} = 0.$

Further we assume that

$\left|\frac{\partial^i p_N}{\partial \theta^i}\right| < T_i(x_1, x_2, \cdots, x_N),$

where

$\int T_i(x_1, x_2, \cdots, x_N) \prod_{u=1}^{N} dx_u$ is finite for all $i$.

Under the above assumptions we will show that

$\phi_0(N) = 1, \ \phi_1(N), \ \phi_2(N), \cdots$

form a set of orthogonal polynomials in $\phi_1(N)$ with respect to the weight function
$p_N(x_1, x_2, \cdots, x_N; \theta)$.

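Proof: We can easily see that

(5.1.3) $\frac{\partial \phi_i}{\partial \theta} = \phi_{i+1} - \phi_1 \phi_i,$

where $\phi_i(N)$ is shortened to $\phi_i$, for convenience. Differentiating (5.1.1) with
respect to $\theta$,

(5.1.4) $\frac{\partial \phi_1}{\partial \theta} = -\frac{1}{K}\frac{dK}{d\theta}\,\phi_1 - \frac{1}{K}.$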





Let us designate

(5.1.5) $\lambda_i = \frac{1}{K}\frac{d^i K}{d\theta^i}$

for all integral values of $i$. From (5.1.3) and (5.1.4), it follows that

(5.1.6) $\phi_2 - \phi_1^2 = -\lambda_1 \phi_1 - \frac{1}{K}.$

Differentiating (5.1.6) further with regard to $\theta$ and using (5.1.3) and (5.1.4) we
obtain

(5.1.7) $\phi_3 - \phi_1 \phi_2 = -2\lambda_1 \phi_2 - \left(\lambda_2 + \frac{2}{K}\right)\phi_1.$

Differentiating (5.1.7) with regard to $\theta$, and using (5.1.2) we get

(5.1.8) $\phi_4 - \phi_1 \phi_3 = -3\lambda_1 \phi_3 - \left(3\lambda_2 + \frac{3}{K}\right)\phi_2.$

We assume generally that

(5.1.9) $\phi_{i+1} - \phi_1 \phi_i = -i\lambda_1 \phi_i - \left[\frac{i(i-1)}{2}\lambda_2 + \frac{i}{K}\right]\phi_{i-1}.$

Differentiating (5.1.9), and employing (5.1.2), (5.1.3) and (5.1.9) we obtain

(5.1.10) $\phi_{i+2} - \phi_1 \phi_{i+1} = -(i + 1)\lambda_1 \phi_{i+1} - \left[\frac{(i+1)i}{2}\lambda_2 + \frac{i+1}{K}\right]\phi_i.$

We know that (5.1.9) holds for $i = 1, 2, 3$, $\phi_0$ being taken equal to one, and
we have proved that if (5.1.9) is true for $i = j$, it is true for $i = j + 1$. Thus
by mathematical induction (5.1.9) holds good for all integral values of $i$.

It is also clear from (5.1.6) and (5.1.9) that $\phi_i$ can be expressed as a
polynomial in $\phi_1$ of the $i$-th degree, the coefficient of $\phi_1^i$ being equal to unity.

To complete the proof of our assertion we will now prove that

(5.1.11) $E(\phi_i \cdot \phi_j) = 0, \quad i \neq j.$

From (5.1.9)

(5.1.12) $\phi_1 \phi_i = \phi_{i+1} + A_i \phi_i + B_i \phi_{i-1},$

where $i$ is any positive integer. We multiply both sides of (5.1.12) by $\phi_1$ and
reduce every product to a linear combination of the $\phi$'s with the
help of (5.1.12). Repeating this process $j - 1$ times $(j < i)$ it follows that

(5.1.13) $\phi_1^j \cdot \phi_i = \phi_{i+j} + \sum_{r=0}^{2j-1} C_r \phi_{i-j+r},$

where the $C_r$ depend on $\theta$ only. As

$E(\phi_i \cdot \phi_0) = 0, \quad i \neq 0,$

we have

(5.1.14) $E(\phi_1^j \cdot \phi_i) = 0, \quad (j < i).$





Now, since $\phi_j$ is a polynomial of the $j$-th degree in $\phi_1$, we conclude that (5.1.11)
is true for all integral (positive) values of $i$.
Thus we obtain

(5.1.15) $\phi_0(N) = 1, \ \phi_1(N), \ \phi_2(N), \cdots, \phi_i(N), \cdots,$

as a set of orthogonal polynomials in $\phi_1(N)$, the weight function being
$p_N(x_1, x_2, \cdots, x_N; \theta)$.

Furthcimoze 

2^-3 

(6 I 16) 0 I ^ = 02i_j ^ d'u ^ '/'a(+i-M + fi2i-2 01 

where 






$A_i$ and $B_i$, the coefficients of $\phi_i$ and $\phi_{i-1}$ respectively in (5.1.12), for the above
four cases are given as below:

A, 

10 * ^ 

. i(t — 1) 

2 2t/d "T 2^a 


t(l - 29) - 0_ 1 


liV 


1(1 - 9) 9(1 - 9) 9(1 - 9) 

4 »/9 iW/9 , . , 

It may be mentioned that in all these cases $\{\phi_i\}$ are also a complete set of
polynomials.
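
The orthogonality (5.1.11) can be verified exactly in a simple case. The sketch below is illustrative and not from the original paper: it takes $N$ Bernoulli trials with parameter $\theta$, for which the sample proportion is efficient and $K = \theta(1-\theta)/N$ satisfies (5.1.2), computes $\phi_i = \frac{1}{p}\frac{\partial^i p}{\partial\theta^i}$ by differentiating the likelihood as a polynomial in $\theta$, and evaluates $E(\phi_i \cdot \phi_j)$ in rational arithmetic. The function names and the values $N = 5$, $\theta = 1/3$ are assumptions.

```python
from fractions import Fraction
from math import comb, factorial

def phi(i, k, N, theta):
    """phi_i = (1/p) d^i p / d theta^i for p = theta^k (1 - theta)^(N - k)."""
    # Coefficients of theta^k (1 - theta)^(N - k) in ascending powers of theta.
    coeffs = [0] * (N + 1)
    for r in range(N - k + 1):
        coeffs[k + r] = comb(N - k, r) * (-1) ** r
    deriv = coeffs
    for _ in range(i):
        # Differentiate the polynomial once.
        deriv = [power * c for power, c in enumerate(deriv)][1:]

    def evaluate(cs):
        return sum(c * theta ** power for power, c in enumerate(cs))

    return evaluate(deriv) / evaluate(coeffs)

def inner(i, j, N, theta):
    """E(phi_i * phi_j) when the number of successes k is binomial (N, theta)."""
    return sum(comb(N, k) * theta ** k * (1 - theta) ** (N - k)
               * phi(i, k, N, theta) * phi(j, k, N, theta)
               for k in range(N + 1))

N, theta = 5, Fraction(1, 3)
gram = [[inner(i, j, N, theta) for j in range(1, 4)] for i in range(1, 4)]
for row in gram:
    print(row)   # off-diagonal entries vanish
```

In this case the diagonal entries equal $(i!)^2 \binom{N}{i} / [\theta(1-\theta)]^i$, the first of which is the usual information $N/\theta(1-\theta)$.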

§5.2. Let $\sum_{i=1}^{m} K_i \phi_i(n)$, where $K_i$ $(i = 1, 2, \cdots, m)$ depends upon $\theta$, be such that
$\sum_{i=1}^{m} K_i \phi_i(n)$ and $\phi_i(n)$ satisfy the regularity conditions mentioned in §2.2. Then
we will show that $\sum_{i=1}^{m} K_i \phi_i(n)$ cannot be a function of $x_1, x_2, \cdots, x_n$ alone except
for constant zero.

Proof: Let us assume that $\sum_{i=1}^{m} K_i \phi_i(n)$ is independent of $\theta$, that is, it is
some statistic, say,

(5.2.1) $\theta^*(x_1, x_2, \cdots, x_n) = \sum_{i=1}^{m} K_i \phi_i(n).$

Taking expectations of both the sides, we obtain

(5.2.2) $E(\theta^*(x_1, x_2, \cdots, x_n)) = 0.$

Differentiating (5.2.2) $i$ times with regard to $\theta$, we have, because of the
regularity conditions on $\phi_i(n)$ and $\theta^*(x_1, x_2, \cdots, x_n)$,

(5.2.3) $E[\theta^*(x_1, x_2, \cdots, x_n) \cdot \phi_i(n)] = 0, \quad i = 1, 2, \cdots, m.$

It may be noted this is similar to the result in (2.3). From (5.2.3) and (5.2.1)
it follows that

(5.2.4) $E[\theta^*(x_1, x_2, \cdots, x_n)]^2 = 0.$

Thus $\theta^*(x_1, x_2, \cdots, x_n)$ is zero with probability one, that is,

$\sum_{i=1}^{m} K_i \phi_i(n),$

if independent of $\theta$, is zero with probability one. This proves our assertion
that this cannot be a function of $x_1, x_2, \cdots, x_n$ alone except for constant zero.
From the foregoing we deduce the following conclusions.

I. $\phi_1(n)$ or any power of it cannot be a function of the observations free of $\theta$.





II. If a statistic $\theta^*(x_1, x_2, \cdots, x_n)$, which is not a constant with probability
one, can be put in the form

(5.2.5) $\theta^*(x_1, x_2, \cdots, x_n) = K_0 + \sum_{i=1}^{m} K_i \phi_i(n),$

where $m$ is some finite positive integer, then

(i) $K_0$ must depend upon $\theta$.

(ii) The expression (5.2.5) for $\theta^*(x_1, x_2, \cdots, x_n)$ in $\phi_i(n)$ is unique.

(iii) No other unbiased estimate of $K_0$ satisfying the regularity conditions
can be put in the form (5.2.5).

(iv) When $m = 1$, there is no other statistic except $a\theta^* + b$, where $a$ and $b$
are constants independent of $\theta$, which can be put in the above form
$K_0 + K_1 \phi_1(n)$, where $K_0$ and $K_1$ are differentiable functions of $\theta$ and $K_1$ does
not vanish for any $\theta$ under consideration.

(v) Let $\zeta$ be any function of $x_1, x_2, \cdots, x_n$, free of $\theta$, satisfying the regularity
conditions of §2.2 with $E(\zeta) = 0$. Since the covariance between $\zeta$ and
$\theta^*(x_1, x_2, \cdots, x_n)$ in (5.2.5) is equal to zero, the statistic of the form
(5.2.5) has the least variance of all unbiased estimates of $K_0$ that satisfy
the regularity conditions of §2.2.

Also, if the probability density or the probability function depends on more
than one parameter, then all the above results except (iv) hold good if

$\sum_{i=1}^{m} K_i \phi_i(n)$

is replaced by

$\sum_{i_1 + i_2 + \cdots + i_T = 1}^{m} K_{i_1 i_2 \cdots i_T} \, \phi_{i_1 i_2 \cdots i_T}(n).$

§5.3. Let us now prove the assertion made in (iv) of §5.2, when $m$ is equal to
one.

Suppose the contrary that there is a statistic $\zeta(x_1, x_2, \cdots, x_n)$ which is of
the form

(5.3.1) $\zeta(x_1, x_2, \cdots, x_n) = L_0 + L_1 \cdot \phi_1(n).$

$\theta^*(x_1, x_2, \cdots, x_n)$, of course, has the form

(5.3.2) $\theta^*(x_1, x_2, \cdots, x_n) = K_0 + K_1 \cdot \phi_1(n).$

We will assume $K_0$, $K_1$, $L_0$, $L_1$ to be differentiable functions of $\theta$ and that
$K_1$, $L_1$ do not vanish for values of $\theta$ under consideration.
Differentiating, with respect to $\theta$, the expressions in (5.3.1) and (5.3.2), we
have

(5.3.3) $\frac{dL_0}{d\theta} + \frac{dL_1}{d\theta} \cdot \phi_1 + L_1(\phi_2 - \phi_1^2) = 0,$

(5.3.4) $\frac{dK_0}{d\theta} + \frac{dK_1}{d\theta} \cdot \phi_1 + K_1(\phi_2 - \phi_1^2) = 0,$





where $\phi_i$ is short for $\phi_i(n)$. Taking the expectations of the above and
rearranging, it follows that

(5.3.5) $E(\phi_1^2) = \frac{1}{L_1}\frac{dL_0}{d\theta} = \frac{1}{K_1}\frac{dK_0}{d\theta}.$

From (5.3.3) to (5.3.5), we deduce that

(5.3.6) $\frac{1}{L_1}\frac{dL_1}{d\theta} = \frac{1}{K_1}\frac{dK_1}{d\theta}.$

Now solving the above differential equation, we get

(5.3.7) $L_1 = aK_1,$

where $a$ is a constant independent of $\theta$. From (5.3.5) and (5.3.7) it follows that

(5.3.8) $L_0 = aK_0 + b,$

where $b$ is a constant independent of $\theta$. From (5.3.7) and (5.3.8) we conclude
that the statistic in (5.3.1) must be of the form $a\theta^* + b$, which proves our
assertion. An immediate consequence is that if there exists an efficient statistic for
estimating $\gamma(\theta)$, then no other function of $\theta$ except $a\gamma(\theta) + b$ can have an
efficient estimate.¹
§5.4. If $\theta^*(x_1, x_2, \cdots, x_n)$ is an unbiased estimate of $\theta$ satisfying the
following conditions:

(i) Among all unbiased estimates of $\theta$ having finite variances, which are
functions of $\theta^*$, $\theta^*$ is one with the least variance,

(ii) For all $\theta$ there exists a complete set of polynomials with respect to the
distribution function of $\theta^*$, then there exists no unbiased estimate of $\theta$ with
a finite variance, which is functionally dependent upon $\theta^*$, except $\theta^*$.

Proof: Let $\theta^*$ be the unbiased estimate of $\theta$ which has the least variance
among all unbiased estimates of $\theta$ which are functions of $\theta^*$. Further let $S(\theta^*)$
be any function of $\theta^*$ free of $\theta$, whose expectation exists and is equal to zero.
Let the variance of $S(\theta^*)$ be finite. It is well known that for any such $S(\theta^*)$

$E(\theta^* \cdot S(\theta^*)) = 0.$

Now $\theta^* S(\theta^*)$ in turn having expectation equal to zero, we obtain

$E(\theta^{*2} \cdot S(\theta^*)) = 0.$

Repeating the above $i$ times we obtain, in general, that

$E(\theta^{*i} \cdot S(\theta^*)) = 0$

for all positive integers $i$. From the above, with the help of condition (ii),
we conclude that $S(\theta^*)$ must be equal to zero. Thus if $H(\theta^*)$ is an unbiased
estimate of $\theta$ with finite variance, then from above, $H(\theta^*) - \theta^*$, having the
expectation zero and a finite variance, must be zero with probability one. Thus
$H(\theta^*)$ is the same as $\theta^*$, which proves the result.

¹ We assume the existence of [...] and postulate that [...] do not vanish for any $\theta$
under consideration.

Example. If $\theta^*$ is of the form (5.2.5) and condition (ii) is satisfied, then
there is no function of $\theta^*$, free of $\theta$ and having a finite variance, whose
expectation is $K_0$.

Conditions (i) and (ii) above are satisfied for estimating $\theta$ in the examples
quoted at the end of the section 5.1, and thus in these cases the result holds
good when $\theta^*$ is the efficient estimate.

I am highly thankful to Professor J. Wolfowitz for his guidance and help in
this research.

REFERENCES

[1] H. Cramér, Mathematical Methods of Statistics, Princeton Univ. Press, 1946, p. 480.
[2] C. R. Rao, "Information and the accuracy attainable in the estimation of statistical
parameters," Calcutta Math. Soc. Bull., September, 1946.
[3] A. Bhattacharyya, "On some analogues of the amount of information and their use
in statistical estimation," Sankhyā, Vol. 8 (1946); also "On some analogues of the
amount of information and their use in statistical estimation," Sankhyā, Vol. 8
(1947).
[4] H. Cramér, "Contributions to the theory of statistical estimation," Skandinavisk
Aktuarietidskrift, Vol. 29 (1946), pp. 85-94.
[5] J. Wolfowitz, "Efficiency of sequential estimates," Annals of Math. Stat., Vol. 18 (1947).
[6] Blackwell and Girshick, "A lower bound for the variance of some unbiased
sequential estimates," Annals of Math. Stat., Vol. 18 (1947).
[7] B. O. Koopman, "On distributions admitting a sufficient statistic," Am. Math. Soc.
Trans., Vol. 39 (1936), p. 399.



ON THE THEORY OF SOME NON-PARAMETRIC HYPOTHESES
By E. L. Lehmann and C. Stein
University of California, Berkeley

Summary. For two types of non-parametric hypotheses optimum tests
are derived against certain classes of alternatives. The two kinds of hypotheses
are related and may be illustrated by the following example: (1) The joint
distribution of the variables $X_1, \cdots, X_m, Y_1, \cdots, Y_n$ is invariant under all
permutations of the variables; (2) the variables are independently and identically
distributed. It is shown that the theory of optimum tests for hypotheses of the
first kind is the same as that of optimum similar tests for hypotheses of the
second kind. Most powerful tests are obtained against arbitrary simple
alternatives, and in a number of important cases most stringent tests are derived
against certain composite alternatives. For the example (1), if the distributions
are restricted to probability densities, Pitman's test based on $\bar{y} - \bar{x}$ is most
powerful against the alternatives that the $X$'s and $Y$'s are independently normally
distributed with common variance, and that $E(X_i) = \xi$, $E(Y_i) = \eta$ where
$\eta > \xi$. If $\eta - \xi$ may be positive or negative the test based on $|\bar{y} - \bar{x}|$ is most
stringent. The definitions are sufficiently general that the theory applies to
both continuous and discrete problems, and that tied observations present no
difficulties. It is shown that continuous and discrete problems may be
combined. Pitman's test for example, when applied to certain discrete problems,
coincides with Fisher's exact test, and when $m = n$ the test based on $|\bar{y} - \bar{x}|$ is
most stringent for hypothesis (1) against a broad class of alternatives which
includes both discrete and absolutely continuous distributions.

1. Generalities. In the present paper we study the problem of obtaining
optimum tests for certain non-parametric hypotheses. It is important in this
connection to make some distinctions which are of lesser significance when the
problem is approached from the intuitive point of view which has been customary
in this field. Consider for example the hypothesis $H$ that $Z_1, \cdots, Z_N$ are
independently and identically distributed according to an unknown probability
density function. All tests which have been suggested for testing $H$ are valid
also for testing the hypothesis $H'$ that the unknown joint probability density
function of the $Z$'s is symmetric in its $N$ arguments. On the other hand, tests
which have optimum properties for testing $H'$ against a certain class of
alternatives will in general not possess the same properties when $H'$ is replaced by $H$.

From the present point of view the two hypotheses mentioned are essentially
different. We shall be concerned in this paper primarily with generalizations
of $H'$, and we shall show that many of the tests suggested in the literature have
optimum properties for testing hypotheses of this kind against certain classes of
alternatives.

The corresponding general theory for hypotheses related to $H$ is quite different.



It is however the same, provided tests of these latter hypotheses
are restricted to similar regions. Specifically, all results on optimum
tests of $H'$ are equivalent to the corresponding results on optimum similar tests
of $H$, and this equivalence holds also for many of the more general hypotheses
considered in this paper.


It should be observed that in many experimental situations, the hypothesis
$H'$ that the joint distribution of the $Z$'s is invariant under all permutations is
more realistic than the hypothesis $H$ that the $Z$'s are independently and
identically distributed. For example, suppose there is a block of land divided into
$m + n$ plots, and the experimenter wants to test whether one of two fertilizers
(used in fixed amounts) is more effective than the other in increasing the yield
of a certain plant. Of these plots, $m$ are chosen at random, fertilizer I is applied
to these, and fertilizer II to the other $n$. If $X_i$ denotes the yield from the $i$th
plot to which fertilizer I has been applied and $Y_j$ denotes the yield from the $j$th
plot to which fertilizer II has been applied, where the plots are numbered at
random, then the hypothesis that the two fertilizers are completely equivalent
implies that the application of any permutation to $X_1, \cdots, X_m, Y_1, \cdots, Y_n$
does not change their joint distribution. But it is not reasonable to suppose the
$X_i$, $Y_j$ are independently and identically distributed, since there may be intrinsic
differences among the plots. For discussions of these and related points, see
Fisher [1], Neyman [2], Pitman [3]. It may be that in many particular cases
some hypothesis between the two is really appropriate but the hypothesis $H'$ is the
only one that is evidently appropriate from a cursory inspection of the setup.
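
The fertilizer example leads naturally to a test computed from the permutation distribution. The following is a minimal sketch, not taken from the paper, of a one-sided test based on $\bar{y} - \bar{x}$ (the statistic attributed to Pitman in the Summary): under the hypothesis of invariance every assignment of $m$ of the pooled yields to fertilizer I is equally likely, so the significance probability is the proportion of assignments giving a difference at least as large as the one observed. The function name and the sample data are hypothetical.

```python
from itertools import combinations

def permutation_p_value(x, y):
    """One-sided permutation significance probability for the statistic y-bar minus x-bar."""
    pooled = list(x) + list(y)
    m, n = len(x), len(y)
    total = sum(pooled)
    observed = sum(y) / n - sum(x) / m
    count = 0
    at_least_as_large = 0
    # Enumerate every choice of m plots that could have received fertilizer I.
    for chosen in combinations(range(m + n), m):
        sum_x = sum(pooled[i] for i in chosen)
        diff = (total - sum_x) / n - sum_x / m
        count += 1
        if diff >= observed - 1e-12:   # small tolerance for floating point ties
            at_least_as_large += 1
    return at_least_as_large / count

# Hypothetical yields: fertilizer I on three plots, fertilizer II on three plots.
print(permutation_p_value([1, 2, 3], [4, 5, 6]))
```

With three plots per fertilizer there are only 20 equally likely assignments, so the smallest attainable significance probability is 1/20.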

Many of the alternative hypotheses considered below, for example those
involving normality, are dictated more by tradition and ease of treatment than
by appropriateness in actual experiments. Thus this paper should not be
considered as providing absolute justification for tests such as Pitman's but
rather as suggesting a method of obtaining optimum non-parametric tests when
the class of alternatives is fairly well specified.

Another possibility, first raised by Neyman [2], which has been ignored in this
paper is the equality on the average of the two fertilizers but with fertilizer I
having a larger dispersion than fertilizer II, or a distribution differing in some
other characteristic. It would be reasonable to consider this as part of the
hypothesis tested, but tests based on randomization may give a probability of
rejection of the hypothesis of equivalence in this case which is much higher than
the stated level of significance. We hope to return to problems of this type in
a later paper.

Let us make the following basic assumptions. $\mathcal{Z}$ is a space of points $z$ and $\mathcal{A}$
is an additive class of subsets $A$ of $\mathcal{Z}$. Any member of $\mathcal{A}$ will be said to be
measurable. By a probability distribution we mean a measure $F$, defined over
$\mathcal{A}$ for which $F(\mathcal{Z}) = 1$. We shall be concerned with two classes of probability
distributions. One, the class of all distributions, and two, the class of
distributions which are absolutely continuous with respect to a given measure $\mu$, that is,
the class of distributions $F$ for which there exists a function $f$ such that





(1.1) $F(A) = \int_A f(z)\, d\mu(z).$

We shall call $f$ a generalized probability density function with respect to $\mu$. By
$Z$ we denote a random variable such that for any $A$ in $\mathcal{A}$,

(1.2) $P\{Z \in A\} = F(A).$

For most of the applications we shall take $\mathcal{Z}$ to be a Euclidean space and $\mathcal{A}$
to be the class of all Borel sets. Then if $\mu$ is Lebesgue measure, (1.1) states that
$f$ is a probability density function in the usual sense. However, we shall have
occasion to consider also some measures other than Lebesgue measure. By a
hypothesis $H$ we mean a class of probability distributions. Next we describe
the hypotheses with which we shall be concerned. Let $\Pi$ be a partition of $\mathcal{Z}$,
that is, let $\Pi$ be a class of mutually exclusive subsets $S$ of $\mathcal{Z}$ such that every
point $z$ of $\mathcal{Z}$ lies in one of the sets $S$. If two points $z_1$ and $z_2$ lie in the same set $S$
we shall say that $z_1$ is equivalent to $z_2$ with respect to $\Pi$: $z_1 \sim z_2 \pmod{\Pi}$. The
set of all points which are equivalent to $z$ will be denoted by $T(z)$, the number of
points of $T(z)$ by $n(z)$. Concerning $\Pi$ we make the following assumptions:

(i) All sets in $\Pi$ are finite, so that $n(z)$ is finite for all $z$.

(ii) If we define $S_n$ as the union of all those sets $S$ of $\Pi$ which contain exactly $n$
points, there exist mutually exclusive sets $S_n^{(1)}, \cdots, S_n^{(n)}$ which are measurable
and such that every element $S$ of $\Pi$ containing exactly $n$ points has one and
only one point in common with each $S_n^{(j)}$.


We shall say that a measure $\mu$ is invariant under $\Pi$ if the following condition
holds: For all $n$ and $j \leq n$, if $A$ is any set contained in $S_n^{(1)}$ and if $A'$ denotes
the set of equivalent points in $S_n^{(j)}$, then $\mu(A) = \mu(A')$.

For any partition $\Pi$ satisfying (i) and (ii), we formulate the hypothesis $H$
that the distribution $F$ of $Z$ is invariant under $\Pi$. We shall refer to $H$ as the
hypothesis of invariance under $\Pi$. We shall also consider the hypothesis of
invariance under a partition for a class of generalized densities $f$. In this case
we assume that the measure $\mu$ of (1.1) is given, and that $\Pi$, in addition to (i)
and (ii), satisfies the condition

(iii) $\mu$ is invariant under $\Pi$.

The hypothesis then states that $z_1 \sim z_2 \pmod{\Pi}$ implies $f(z_1) = f(z_2)$.

to to mto "1 " r "''!<■ f'" " 0 <'■. 

d Weto tL cilfllor 7^ ® " .... 

of to difficult eucouatered bv ?rl 
The size of a test $\varphi$ is defined to be

(1.3) $\epsilon(\varphi) = \sup_{F \in H} \int \varphi(z)\, dF(z).$





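If a test $\varphi$ satisfies

(1.4) $\int \varphi(z)\, dF(z) = \epsilon$

for all $F$ in $H$, $\varphi$ is said to be similar of size $\epsilon$ for $H$. Extending the terminology of
Scheffé, we say that $\varphi$ has structure $S(\epsilon)$ if for all $z$ in $S_n$

(1.5) $\frac{1}{n(z)} \sum_{z' \in T(z)} \varphi(z') = \epsilon.$

The following lemma is essentially a result of Scheffé.

LEMMA 1. For testing a hypothesis of invariance, any test of structure $S(\epsilon)$
is similar and of size $\epsilon$.

Proof. For $F$ in $H$ and any $\varphi$,

(1.6) $\int \varphi\, dF = \sum_n \sum_{j=1}^{n} \int_{S_n^{(j)}} \varphi(z)\, dF(z) = \sum_n \int_{S_n^{(1)}} \left[\sum_{z' \in T(z)} \varphi(z')\right] dF(z).$

But the sets $S_n$ together make up the whole space, and hence (1.5) holds for all $z$. Therefore

(1.7) $\int \varphi\, dF = \epsilon \sum_n n \int_{S_n^{(1)}} dF = \epsilon.$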

We shall show next that for testing a hypothesis of invariance at level of significance $\epsilon$, only tests of structure $S(\epsilon)$ need be considered. In order to make this result applicable both to hypotheses referring to the class of all distributions and to those referring to a class of generalized densities, we shall state it in an asymmetric form which when taken together with lemma 1 indicates the essential equivalence of the two types of hypotheses.

Lemma 2. If $\varphi$ is any test of a hypothesis of invariance for the class of generalized densities with respect to a fixed measure $\mu$, and if the size of $\varphi$ is less than or equal to $\epsilon$, then there exists a test $\psi$ of structure $S(\epsilon)$ such that

(1.8)  $\int \psi\, dF \ge \int \varphi\, dF$

for all probability distributions $F$.
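Proof. First we shall show that

(1.9)  $\frac{1}{n(z)} \sum_{z' \in T(z)} \varphi(z') \le \epsilon$

almost everywhere $\mu$. Let $A$ be the set of points $z$ such that

(1.10)  $\frac{1}{n(z)} \sum_{z' \in T(z)} \varphi(z') > \epsilon$

and suppose that $\mu(A)$ is positive. Let

(1.11)  $f(z) = \begin{cases} 1/\mu(A) & \text{if } z \in A, \\ 0 & \text{elsewhere.} \end{cases}$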




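Then $f$ is a probability density belonging to $H$, since $A$ contains with any point all points equivalent to it. But

(1.12)  $\int \varphi f\, d\mu > \epsilon,$

in contradiction to the assumption that the size of $\varphi$ does not exceed $\epsilon$.

From (1.9) it follows easily that there exists a test $\psi$ of structure $S(\epsilon)$ such that for almost all $z$, $\psi(z) \ge \varphi(z)$.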

This question is answered by

Theorem 1. Let $\Pi_0$ and $\Pi_1$ be two partitions satisfying conditions (i), (ii) and (iii), and such that $z \sim z'$ (mod $\Pi_1$) implies $z \sim z'$ (mod $\Pi_0$). For the class of generalized densities with respect to $\mu$, denote by $H_i$ $(i = 0, 1)$ the hypothesis of invariance relative to $\Pi_i$. Then for testing $H_0$ against $H_1$ at level of significance $\epsilon$, the totality of tests $\varphi$ which (a) have structure $S(\epsilon)$, and for which (b) $z \sim z'$ (mod $\Pi_1$) implies $\varphi(z) = \varphi(z')$, form a complete class of tests.

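Proof. It is easily seen that we can restrict ourselves to those tests of structure $S(\epsilon)$ which possess property (b). For if $\varphi$ is any test of structure $S(\epsilon)$ relative to $\Pi_0$, let

(1.14)  $\varphi^*(z) = \frac{1}{n_1(z)} \sum_{z' \in T_1(z)} \varphi(z'),$

where $T_1(z)$ denotes the set of points equivalent to $z$ (mod $\Pi_1$) and $n_1(z)$ the number of these points. Then clearly $\varphi^*$ possesses property (b) and has structure $S(\epsilon)$. Furthermore if $f$ is any probability density function of $H_1$, then

(1.15)  $\int \varphi^* f\, d\mu = \int \varphi f\, d\mu,$

so that $\varphi$ and $\varphi^*$ have identical power against $H_1$.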

In order to complete the proof, we must show that if $\varphi_1$ and $\varphi_2$ are any two tests satisfying (a) and (b), and if $\varphi_1$ and $\varphi_2$ differ on a set of positive measure, there exists a probability density function $f$ of $H_1$ for which

(1.16)  $\int \varphi_1 f\, d\mu > \int \varphi_2 f\, d\mu.$

Since both $\varphi_1$ and $\varphi_2$ have structure $S(\epsilon)$, the set $A$ of points $z$ for which

(1.17)  $\varphi_1(z) > \varphi_2(z)$

has positive measure. Also, because of (b), if two points are equivalent relative to $\Pi_1$, they are either both in $A$ or both not in $A$. If $f(z)$ is defined as $1/\mu(A)$ for $z$ in $A$ and as zero elsewhere, then $f$ is in $H_1$ and satisfies (1.16).







lln f-n-in f(lt» iin<4 |-^^.in llii cn in I l)V li'Uin^ llii* liypoUicei’H lU and Ih 
rrfi r lt«* Mn' 1 1 1 ' ^ diHinhiHnitiH rulli<T tliiin to a particular das'! 

hi 4 flni'')! If % |4 1 p ,ivlv filf+dj lnit‘, and t n»i^ llit'wi t\\o tlicoroma 

( hlll4 ,mI O i r 

Since the most powerful test for a hypothesis of invariance $H_0$ referring to a class of generalized densities against an alternative $f$ from this class of densities is of the correct size also for the wider hypothesis $H_0'$ referring to the class of all distributions, it is also most powerful for testing $H_0'$ against $f$. The corresponding remark holds for most stringent tests. Therefore all optimum tests which will be obtained in the sequel through the use of theorems of this section may be considered as tests of hypotheses referring to the class of all distributions. They are valid against these hypotheses, and no power is gained by restricting the hypotheses to the appropriate class of generalized densities.

2. Most powerful tests and most stringent tests. One of the main problems to be discussed in this paper is the determination of a most powerful test of a hypothesis of invariance against a simple alternative. If we restrict our considerations to the class of generalized densities with respect to $\mu$, a complete solution of this problem is given by the following

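Theorem 2. Let $H$ be the hypothesis of invariance under the partition $\Pi$, and let $g$ be a probability density function not in $H$. For any $z$, denote by $z^{(1)}, \dots, z^{(n)}$ the $n$ points of $T(z)$ arranged so that $g(z^{(1)}) \ge \cdots \ge g(z^{(n)})$. For testing $H$ against $g$, a most powerful test of size $\epsilon$ is given by

(2.1)  $\varphi(z) = \begin{cases} 1 & \text{if } g(z) > g(z^{(1+[n\epsilon])}), \\ a & \text{if } g(z) = g(z^{(1+[n\epsilon])}), \\ 0 & \text{if } g(z) < g(z^{(1+[n\epsilon])}), \end{cases}$

where $[n\epsilon]$ denotes the largest integer not exceeding $n\epsilon$, where $a$ is determined by $\sum_{i=1}^{n} \varphi(z^{(i)}) = n\epsilon$, $0 \le a \le 1$, and where $a$ may depend on $z$ through $T(z)$.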


Proof. First we observe that the number of $z^{(i)}$ for which $g(z^{(i)}) \ge g(z^{(1+[n\epsilon])})$ is greater than or equal to $1 + [n\epsilon] > n\epsilon$, and that the number of $z^{(i)}$ for which $g(z^{(i)}) > g(z^{(1+[n\epsilon])})$ is less than or equal to $[n\epsilon] \le n\epsilon$, so that there exists an $a$ between 0 and 1 for which $\sum \varphi(z^{(i)}) = n\epsilon$. Since $\varphi$ has structure $S(\epsilon)$, it follows from lemma 1 that it is similar and of size $\epsilon$. Let

(2.2)  $g^*(z) = g(z^{(1+[n\epsilon])}).$

To complete the proof consider first the special case that

(2.3)  $\int g^*(z)\, d\mu(z)$

vanishes. Then

(2.4)  $\int \varphi g\, d\mu = \int g\, d\mu = 1,$





that is, the test $\varphi$ has power 1, and therefore is clearly most powerful. Assume next that the integral (2.3) is positive. Then $g^* / \int g^*\, d\mu$ is a probability density function of $H$. For it is measurable and possesses the invariance properties required of a member of $H$, and the integral (2.3) is finite since

(2.5)  $\int g^*(z)\, d\mu(z) \le \frac{1}{\epsilon} \int \frac{1}{n} \sum_{i=1}^{n} g(z^{(i)})\, d\mu(z) = \frac{1}{\epsilon}.$

The test $\varphi$ therefore has the form of a probability ratio test. Since it is similar, it follows from theorem 1 of [4] that $\varphi$ is most powerful.
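
As a small illustration of the test of theorem 2, the following Python sketch computes the critical function on a single finite set $T(z)$ of equivalent points: it equals 1 above the $(1+[n\epsilon])$-th largest density value, 0 below it, and a constant $a$ on ties. This is only a sketch under stated assumptions: the function name `critical_function` and the representation of $T(z)$ by a list of density values are hypothetical, and exact rational arithmetic is used so that the condition $\sum \varphi(z^{(i)}) = n\epsilon$ can be checked exactly.

```python
from fractions import Fraction
from math import floor


def critical_function(g_values, eps):
    """Values of phi on one set T(z) of equivalent points.

    g_values: density values g(z') for the n points z' of T(z) (hypothetical input).
    eps: level of significance with 0 < eps < 1, given as an exact rational.
    """
    eps = Fraction(eps)
    n = len(g_values)
    ordered = sorted(g_values, reverse=True)   # g(z^(1)) >= ... >= g(z^(n))
    cutoff = ordered[floor(n * eps)]           # g(z^(1 + [n eps])), 0-based index
    above = sum(1 for g in g_values if g > cutoff)
    ties = sum(1 for g in g_values if g == cutoff)
    # Randomization constant a, chosen so that the values of phi sum to n * eps.
    a = (n * eps - above) / ties
    return [Fraction(1) if g > cutoff else a if g == cutoff else Fraction(0)
            for g in g_values]


phi = critical_function([5, 3, 3, 1], Fraction(3, 8))
print(phi, sum(phi))
```

With four equivalent points and $\epsilon = 3/8$ the cutoff is the second largest value, the two tied points receive $a = 1/4$, and the values of $\varphi$ add up to $n\epsilon = 3/2$.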
In practice one is usually interested in composite rather than simple alternatives. We shall therefore consider next the problem of deriving most stringent tests of hypotheses of invariance against certain classes of alternatives. This problem may be reduced to that of finding tests which maximize the minimum power over a class of alternatives by the following simple theorem of Hunt and Stein [7].

Theorem 3. Given a hypothesis $H$ and a class of alternatives $\{H_\theta\}$, $\theta \in \Omega$, denote by $\beta^*(\theta)$ the envelope power function corresponding to the level of significance $\epsilon$, that is, let

(2.6)  $\beta^*(\theta) = \sup_{\varphi} \beta(\varphi, \theta),$

where $\beta(\varphi, \theta)$ stands for the power of the test $\varphi$ against the alternative $\theta$, and where the least upper bound is taken over all tests $\varphi$ of size $\le \epsilon$. Let $\{\Omega_\delta\}$ be a class of mutually exclusive subsets of $\Omega$ such that $\bigcup \Omega_\delta = \Omega$ and such that $\beta^*$ is constant on each $\Omega_\delta$. Denote by $\varphi_\delta$ a test which maximizes the minimum power over $\Omega_\delta$. If $\varphi_\delta = \varphi$ is independent of $\delta$, then $\varphi$ is most stringent for testing $H$ at level of significance $\epsilon$.

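For obtaining tests which maximize the minimum power over a class of alternatives, we have

Theorem 4. Let $H$ be a hypothesis of invariance, and let $\{g_\theta\}$, $\theta \in \Omega$, be a class of alternatives. Suppose there exists a subset $\Omega'$ of $\Omega$ and a probability measure $\lambda$ over $\Omega'$ such that for the test $\varphi$ of size $\epsilon$ defined as in theorem 2 with

(2.7)  $g(z) = \int_{\Omega'} g_\theta(z)\, d\lambda(\theta),$

the integral $\int \varphi g_\theta\, d\mu$ is constant for $\theta$ in $\Omega'$, and

(2.8)  $\int \varphi g_\theta\, d\mu \le \int \varphi g_{\theta'}\, d\mu$ for all $\theta \in \Omega'$, $\theta' \in \Omega$.

Then $\varphi$ maximizes the minimum power over $\Omega$ at level of significance $\epsilon$.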




mdope p„.r apd^^or°r’,‘,haI.'',f If II 





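Proof. By theorem 2, $\varphi$ is a most powerful test for testing $H$ against $g$, that is, for any test $\psi$ of size $\le \epsilon$,

(2.9)  $\int \psi(z) \int_{\Omega'} g_\theta(z)\, d\lambda(\theta)\, d\mu(z) \le \int \varphi(z) \int_{\Omega'} g_\theta(z)\, d\lambda(\theta)\, d\mu(z).$

Consequently

$\inf_{\theta \in \Omega} \int \psi g_\theta\, d\mu \le \inf_{\theta \in \Omega'} \int \psi g_\theta\, d\mu \le \int_{\Omega'} d\lambda(\theta) \int \psi(z) g_\theta(z)\, d\mu(z)$

$\le \int_{\Omega'} d\lambda(\theta) \int \varphi(z) g_\theta(z)\, d\mu(z) = \inf_{\theta \in \Omega'} \int \varphi g_\theta\, d\mu = \inf_{\theta \in \Omega} \int \varphi g_\theta\, d\mu.$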

3. Normal alternatives. Let $H$ be the hypothesis of invariance under $\Pi$, let $T(z)$ be the set of points equivalent to $z$ (mod $\Pi$), and let $f$ and $g$ be two functions defined over the sample space. We shall write $f \sim g$ if there exists a function $F$ such that

(3.1)  $f(z) = F[g(z), T(z)],$

where for any $T(z)$, $F$ is a strictly increasing function of $g$. We note that $f \sim g$ in the following two special cases:

(i) $f(z) = F[g(z)]$ where $F$ is strictly increasing;

(ii) $f(z) = a(z) g(z) + b(z)$ where $a(z) > 0$ for all $z$, and where $z_1 \sim z_2$ (mod $\Pi$) implies $a(z_1) = a(z_2)$, $b(z_1) = b(z_2)$.

The importance of this notation stems from the following remark. Let $g^*$ and $\varphi$ be defined in (2.2) and (2.1) respectively and let $f \sim g$. If the test $\psi$ is obtained from $\varphi$ by substituting $f$ and $f^*$ for $g$ and $g^*$ respectively, then $\psi = \varphi$.

The purpose of the present section is to obtain most powerful and most stringent tests of some hypotheses of invariance against certain classes of normal alternatives. In particular, problems will be exhibited for which various non-parametric tests suggested in the literature possess these optimum properties.
Problem 1. Suppose that the random variables $Z_{ij}$ $(j = 1, \dots, s_i;\ i = 1, \dots, m)$ have a joint probability density function, and denote by $H$ the hypothesis that this probability density is invariant under all permutations of the $s_i$ arguments within the $i$th group for $i = 1, \dots, m$. Consider the alternative $H_1$ that all variables are independently normally distributed with common variance $\sigma^2$, and that

(3.2)  $E(Z_{ij}) = a x_{ij} + b_i,$

where $a$, the $b$'s and the $x$'s are assumed known and where, without essential loss of generality, we assume $a > 0$. Assume further that

«r 

(3 3) 2v 

/-I 





In order to obtain the most powerful test of $H$ against $H_1$, we apply theorem 2 with

(3.4)  $g(z) = C \exp\Big[ -\frac{1}{2\sigma^2} \sum_{i=1}^{m} \sum_{j=1}^{s_i} (z_{ij} - a x_{ij} - b_i)^2 \Big] \sim \sum_{i=1}^{m} \sum_{j=1}^{s_i} x_{ij} z_{ij}.$

The most powerful test is therefore given by (2.1), if we replace $g(z)$ by $\sum\sum x_{ij} z_{ij}$. This test being independent of the $b$'s and of $a > 0$, it is uniformly most powerful against the class of alternatives obtained from $H_1$ by not specifying the values of these parameters but restricting $a$ to be positive.
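
The reduction of the normal density to the statistic $\sum\sum x_{ij} z_{ij}$ can be made explicit by expanding the square in the exponent. The display below is a worked step under the assumptions of problem 1 (independent normal variables with means $a x_{ij} + b_i$, common variance $\sigma^2$, and $a > 0$); the grouping of terms is ours and is offered only as a sketch of the argument.

```latex
-\frac{1}{2\sigma^2}\sum_{i}\sum_{j}\bigl(z_{ij}-a x_{ij}-b_i\bigr)^2
  = -\frac{1}{2\sigma^2}\Bigl[\,\sum_{i}\sum_{j} z_{ij}^2
      \;-\; 2\sum_{i} b_i \sum_{j} z_{ij}
      \;+\; \sum_{i}\sum_{j}\bigl(a x_{ij}+b_i\bigr)^2\Bigr]
    \;+\; \frac{a}{\sigma^2}\sum_{i}\sum_{j} x_{ij} z_{ij}
```

The terms in brackets are unchanged when the $z_{ij}$ are permuted within groups, so on each set of equivalent points the density is a strictly increasing function of $\sum\sum x_{ij} z_{ij}$ when $a > 0$.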
If we drop the restriction $a > 0$, a uniformly most powerful test no longer exists; we shall instead obtain the most stringent test against this extended class of alternatives, using theorems 3 and 4. Clearly the envelope power function is constant on the surfaces $|a|/\sigma$ = constant. Take as the $\Omega'$ of theorem 4 the set consisting of the two points $(a, b_1, \dots, b_m, \sigma)$ and $(-a, b_1, \dots, b_m, \sigma)$.

Let $\lambda$ assign the probability $\tfrac12$ to each of the two points. Then the function of (2.7) becomes

(3.5)  $g(z) = \tfrac12 C \exp\Big[-\frac{1}{2\sigma^2} \sum\sum (z_{ij} - a x_{ij} - b_i)^2\Big] + \tfrac12 C \exp\Big[-\frac{1}{2\sigma^2} \sum\sum (z_{ij} + a x_{ij} - b_i)^2\Big]$

$\sim \exp\Big\{\frac{a}{\sigma^2} \sum\sum x_{ij} z_{ij}\Big\} + \exp\Big\{-\frac{a}{\sigma^2} \sum\sum x_{ij} z_{ij}\Big\}.$

The power of the test $\varphi$ obtained by substituting this expression for $g$ in (2.1) is the same at both points of $\Omega'$. For this test is most powerful for testing $H$ against the simple alternative $H'$ that the density of the $Z$'s is given by the first member of (3.5). But under the transformation $z_{ij}' = 2b_i - z_{ij}$, $H$ and $H'$ and therefore the test $\varphi$ are left invariant, while the two points of $\Omega'$ are permuted. Condition (2.8) of theorem 4 is therefore satisfied, and hence $\varphi$ maximizes the minimum power over $\Omega$. Since furthermore $\varphi$ is independent of the particular set $\Omega$ chosen, it follows from theorem 3 that $\varphi$ is most stringent for the problem.

Now $e^{u} + e^{-u}$ is a strictly increasing function of $|u|$. Therefore the test criterion (3.5) becomes

(3.6)  $\Big| \sum\sum x_{ij} z_{ij} \Big|.$

Some special cases of problem 1 are of particular interest.

a) Suppose that the variables of the $i$th group fall into two subgroups, and write $U_{i1}, \dots, U_{ik_i}, V_{i1}, \dots, V_{il_i}$ for $Z_{i1}, \dots, Z_{is_i}$ $(k_i + l_i = s_i)$. Let

(37) J = 

(l for ; = fc, + i, , ^ ^ 







Then the alternatives give to the variables normal distributions with common variance and such that

(3.8)  $E(V_{ij}) = E(U_{ij}) + a.$

The criterion becomes

(3.9)  $\sum_{i=1}^{m} \frac{k_i l_i}{s_i} (\bar v_i - \bar u_i)$

or

(3.10)  $\Big| \sum_{i=1}^{m} \frac{k_i l_i}{s_i} (\bar v_i - \bar u_i) \Big|$

according as $a$ is restricted to positive values or not.
b) If we specialize still further and let $m = 1$, we are dealing with a problem which would coincide with the two sample problem if we added independence to the assumptions of the hypothesis. (3.10) becomes $|\bar v - \bar u|$, the criterion suggested by Pitman [3].

c) If instead of $m = 1$ we set $k_i = l_i = 1$ for $i = 1, \dots, m$, we are testing interchangeability within each pair $(u_i, v_i)$ against normal alternatives under which the means of $U_i$ and $V_i$ are different, the difference being independent of $i$. The criterion $|\sum (v_i - u_i)|$ to which (3.10) reduces was first suggested by R. A. Fisher [1].

d) As a last example let $m = 1$ in the original problem. Under the hypothesis the joint density of $Z_1, \dots, Z_s$ is symmetric in its $s$ arguments, while under the alternatives the $Z$'s are normally distributed with common variance and mean $a x_i + b$. The criterion reduces to $|\sum (z_i - \bar z)(x_i - \bar x)|$, which was proposed by Pitman [3].
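
For case d) the test can be carried out by complete enumeration when $s$ is small. The following Python sketch computes the proportion of the $s!$ arrangements of the $z$'s whose criterion $|\sum (z_i - \bar z)(x_i - \bar x)|$ is at least as large as the observed one. The function name `pitman_correlation_p_value` is hypothetical, integer data are assumed so that all comparisons are exact, and reporting a p-value instead of the randomized critical function of (2.1) is a simplification on our part.

```python
from fractions import Fraction
from itertools import permutations


def pitman_correlation_p_value(x, z):
    """Exact permutation p-value for |sum (z_i - z_bar)(x_i - x_bar)|."""
    n = len(x)
    # n * (x_i - x_bar); since these sum to zero, sum c_i z_i = n * sum (x_i - x_bar)(z_i - z_bar).
    centered = [n * xi - sum(x) for xi in x]

    def criterion(values):
        return abs(sum(c * zi for c, zi in zip(centered, values)))

    observed = criterion(z)
    arrangements = list(permutations(z))   # all points equivalent to z
    hits = sum(1 for p in arrangements if criterion(p) >= observed)
    return Fraction(hits, len(arrangements))


print(pitman_correlation_p_value([1, 2, 3, 4], [1, 2, 3, 4]))
```

The enumeration grows as $s!$ and is meant only to make the definition concrete.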

We therefore see that several non-parametric tests which have been discussed in the literature are most powerful one-sided or most stringent for testing a hypothesis of invariance against certain classes of normal alternatives. In a later section we shall indicate to what extent these results remain valid if to these hypotheses we add the assumption of independence.

The remaining problems will be considered somewhat more briefly since the proofs follow the same pattern as in problem 1.

Problem 2. The conditions of problem 1d) are satisfied in particular if $x_1, \dots, x_s$ are values taken on by random variables $X_1, \dots, X_s$ and if under the alternatives the pairs $(X_i, Z_i)$ have a common bivariate normal distribution. We are then concerned with a problem related to that of testing for absence of interclass correlation. For the corresponding intraclass problem, we consider random variables $X_1, \dots, X_s, Z_1, \dots, Z_s$ and test the hypothesis that the joint density of the $2s$ variables is symmetric in all its arguments, against the alternatives that the pairs $(X_i, Z_i)$ have a common bivariate normal distribution, the means and variances of the $X$'s and $Z$'s being the same. We







Shan only oon.id.. tho case of •-;XrSr'“\o--r "ko""’ m I'luT 

S*,.,a.mtheonoB,cledc«sooI .. 

r,:srct=ss=i:trr-i..;..';«• f*-; 

iToS by conaidcme all poaaiblc nay,... «1..<'I. a ..- |.......d 

flip pnTnolGtG S6t of 3s obsGf^^Otl^D®' > , 

Problem 3. Consider once more the hypothesis that the joint density of $Z_1, \dots, Z_s$ is symmetric in its $s$ arguments, and consider the alternatives that the $Z$'s are normally distributed with positive circular serial correlation. Then

(3 U) a(i) = e 0%p 5;-, g [{*. - E) ■" ' 

where $z_{s+1} = z_1$. The test based on this criterion, which was proposed by Wald and Wolfowitz [8], is therefore most powerful against this class of alternatives.
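
A small sketch of such a test by enumeration follows. It assumes that the criterion is the circular serial product sum $\sum_{i=1}^{s} z_i z_{i+1}$ with $z_{s+1} = z_1$, the statistic studied by Wald and Wolfowitz; the function name `circular_serial_p_value` is hypothetical, and large values are taken as significant in accordance with the one-sided alternative of positive serial correlation.

```python
from fractions import Fraction
from itertools import permutations


def circular_serial_p_value(z):
    """Exact one-sided permutation p-value for sum z_i * z_(i+1), with z_(s+1) = z_1."""

    def criterion(values):
        s = len(values)
        return sum(values[i] * values[(i + 1) % s] for i in range(s))

    observed = criterion(z)
    arrangements = list(permutations(z))   # all orderings of the observed values
    hits = sum(1 for p in arrangements if criterion(p) >= observed)
    return Fraction(hits, len(arrangements))


print(circular_serial_p_value([1, 2, 4, 3]))
```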

Problem 4. As a last problem, we shall test the hypothesis $H$ that the joint density of $Z_1, \dots, Z_s$ is symmetric in its $s$ arguments and symmetric with respect to each coordinate hyperplane, that is, invariant under the transformations $z_i' = -z_i$, $z_j' = z_j$ for all $j \ne i$, for $i = 1, \dots, s$. This will be tested against the alternatives that the $Z$'s are independently, identically distributed according to a normal distribution with non-zero mean. If we restrict the mean to positive values, we get

(3.12)  $g(z) = C \exp\Big[-\frac{1}{2\sigma^2} \sum_{i=1}^{s} (z_i - \xi)^2\Big] \sim \sum_{i=1}^{s} z_i.$

If on the other hand both positive and negative values are possible for the mean, the most stringent test is based on the statistic $|\sum z_i|$. This test may be appropriate for some situations in which it is customary to use the sign test.
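
Since $\sum z_i$ is unchanged by permutations of the coordinates, only the $2^s$ sign changes need to be enumerated to carry out the test of problem 4. The Python sketch below does this for integer data; the function name `sign_flip_p_value` and the `two_sided` switch are hypothetical, and a p-value is reported in place of the randomized critical function of (2.1).

```python
from fractions import Fraction
from itertools import product


def sign_flip_p_value(z, two_sided=True):
    """Exact p-value for sum z_i (one-sided) or |sum z_i| (two-sided) under sign changes."""

    def criterion(total):
        return abs(total) if two_sided else total

    observed = criterion(sum(z))
    hits = 0
    for signs in product((1, -1), repeat=len(z)):   # all 2^s sign patterns
        total = sum(e * zi for e, zi in zip(signs, z))
        if criterion(total) >= observed:
            hits += 1
    return Fraction(hits, 2 ** len(z))


print(sign_flip_p_value([1, 2, 3]), sign_flip_p_value([1, 2, 3], two_sided=False))
```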






>4* 


4. Binomial and other non-normal alternatives. In the present section we shall be concerned mainly with generalizations of problems 1b) and 1c) of section 3. As described there, the hypotheses referred to the class of all probability densities in the usual sense. However, as was pointed out at the end of section 2, the same tests may be considered as referring to much wider hypotheses. If they are interpreted in this way, it is possible to greatly widen the class of alternatives without destroying the optimum properties of the tests.

Let $Z = (X_1, \dots, X_n, Y_1, \dots, Y_n)$ and denote by $\Pi$ the partition under which two points $z$ and $z'$ are equivalent if they are obtainable from each other by a permutation of coordinates. Let $H_0$ be the hypothesis of invariance under $\Pi$. This is a generalization of the hypothesis of complete symmetry referring to a class of probability densities. Consider as alternative the class of distributions defined by

(4.1)  $P\{Z \in A\} = \int_A C \exp\Big[\theta_1 \sum x_i + \theta_2 \sum y_i + \sum r(x_i) + \sum r(y_i)\Big]\, d\mu(z),$





where the $\theta$'s are any real numbers, where $\mu$ is the $2n$th power of any one dimensional measure $\nu$ (and therefore invariant under $\Pi$), and where $r$ is any $\nu$-measurable function, with the sole restriction that the integral (4.1) converges when taken over the whole space.

We first consider the one sided case $\theta_2 > \theta_1$. Using theorem 2 for a particular $\theta_1$, $\theta_2$, $r$ and $\nu$, we shall have

(4.2)  $g(z) = C \exp\Big[\theta_1 \sum x_i + \theta_2 \sum y_i + \sum r(x_i) + \sum r(y_i)\Big]$

$\sim \theta_1 \sum x_i + \theta_2 \sum y_i = \tfrac12 (\theta_1 + \theta_2)\Big(\sum x_i + \sum y_i\Big) + \tfrac12 (\theta_2 - \theta_1)\Big[\sum y_i - \sum x_i\Big] \sim \sum y_i - \sum x_i.$

Since this test does not depend on $\theta_1$, $\theta_2$, $r$ or $\nu$, it is uniformly most powerful against the one sided class of alternatives $\theta_2 > \theta_1$.

Dropping the restriction $\theta_2 > \theta_1$, we apply theorem 4 with $\Omega'$ the set consisting of the two points $\theta_1, \theta_2, r, \nu$ and $\theta_2, \theta_1, r, \nu$. At these two points the envelope power function obviously takes on the same value. If for $\lambda$ we select the distribution which assigns equal probabilities to both points, then

(4.3)  $g(z) \sim \exp\Big[\theta_1 \sum x_i + \theta_2 \sum y_i\Big] + \exp\Big[\theta_2 \sum x_i + \theta_1 \sum y_i\Big]$

$\sim \exp\Big\{\tfrac12 (\theta_2 - \theta_1)\Big[\sum y_i - \sum x_i\Big]\Big\} + \exp\Big\{\tfrac12 (\theta_2 - \theta_1)\Big[\sum x_i - \sum y_i\Big]\Big\} \sim \Big|\sum x_i - \sum y_i\Big| \sim |\bar x - \bar y|.$

The power of this test clearly is the same against both points of $\Omega'$. Since furthermore the test does not depend on the $\theta$'s, $r$, or $\nu$, it is most stringent against $H_1$.

A univariate distribution such that

(4.4)  $P(X \in A) = \int_A C \exp[\theta x + r(x)]\, d\nu(x)$

has been called Laplacian by Tweedie [9], who has studied these distributions in a different connection. Among others, the normal and $\chi^2$, the binomial and Poisson distributions are Laplacian. To obtain, for example, the distribution of a characteristic variable, take for $\nu$ the measure $\nu^*$ which assigns to a set $D$ the values 0, 1 or 2 according as $D$ contains none, one or both of the points $x = 0$ and $x = 1$, and take as density the function

(4.5)  $p^x (1 - p)^{1-x} = (1 - p) \Big(\frac{p}{1-p}\Big)^{x}.$
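
To see how the distribution of a characteristic variable fits the Laplacian form, one can match terms in the exponent. The identification below is our own worked step, with $\theta$ denoting the parameter multiplying $x$ in the exponent of the Laplacian density and $0 < p < 1$ assumed.

```latex
p^{x}(1-p)^{1-x}
  \;=\; (1-p)\,\exp\!\Bigl[\,x \log\frac{p}{1-p}\Bigr],
  \qquad x \in \{0, 1\},
```

so that the density is Laplacian with $C = 1 - p$, $\theta = \log\{p/(1-p)\}$ and $r(x) = 0$. In particular, a larger value of $\theta$ corresponds to a larger probability of success $p$.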

For comparison with tests which have been considered in the literature, one can specialize the problem just considered, so that the hypothesis $H_0$ and the class of alternatives $H_1$ consist only of those members of $H_0$ and $H_1$ which are generalized densities with respect to a fixed measure $\mu$. One can specialize even further and take as alternative any subset of $H_1$ provided with any point $\theta_1, \theta_2, r$, it also contains the point $\theta_2, \theta_1, r$. The test clearly will not change with these specializations, and the test based on $|\bar x - \bar y|$ will therefore possess the same




optimum properties with respect to these problems as with respect to the problem for which it was originally derived.

If in particular one selects for $\nu$ the measure $\nu^*$ mentioned above, one obtains the problem for which R. A. Fisher proposed the exact test. It follows that this test, Fisher's exact test, is most stringent in connection with the following problem. The random variables $X_1, \dots, X_n, Y_1, \dots, Y_n$ are characteristic variables, that is, they can take on only the values 0 and 1. If we let

(4.6)  $P\{X_1 = x_1, \dots, Y_n = y_n\} = P(x_1, \dots, y_n),$

the hypothesis states that the function $P$ is invariant under all permutations of its arguments. An equivalent formulation is that the probability (4.6) depends only on $\sum x_i + \sum y_i$, the total number of "successes". Fisher's exact test is most stringent against the alternative that the $X$'s and $Y$'s are samples from two populations of characteristic variables, that is, two populations corresponding to distinct probabilities of success.
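
For characteristic variables the enumeration over permutations reduces to a hypergeometric count, since given the total number $t$ of successes only the number of successes among the $X$'s matters. The Python sketch below computes a two-sided p-value for the criterion $|\sum x_i - \sum y_i|$ with two groups of equal size $n$, as in the text. The function name `fisher_exact_two_sided` is hypothetical, and the p-value is reported in place of the randomized critical function of (2.1).

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(x, y):
    """Exact p-value for |sum x - sum y| with 0/1 data and equal group sizes."""
    n = len(x)
    if len(y) != n:
        raise ValueError("this sketch assumes two groups of equal size")
    t = sum(x) + sum(y)                      # total number of successes, held fixed
    observed = abs(2 * sum(x) - t)           # |sum x - sum y| = |2 sum x - t|
    # Number of arrangements with k successes among the x's: C(n, k) * C(n, t - k).
    hits = sum(comb(n, k) * comb(n, t - k)
               for k in range(max(0, t - n), min(n, t) + 1)
               if abs(2 * k - t) >= observed)
    return Fraction(hits, comb(2 * n, t))


print(fisher_exact_two_sided([1, 1, 1, 0], [0, 0, 0, 1]))
```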

Problem 1c) of section 3 can be extended quite analogously. Put again $Z = (X_1, \dots, X_n, Y_1, \dots, Y_n)$, and denote by $\Pi$ the partition under which two points $z$ and $z'$ are equivalent provided they can be obtained from each other by a permutation of coordinates in which only the coordinates within pairs $(x_i, y_i)$ are interchanged. Consider the hypothesis of invariance under $\Pi$ with reference to the class of all distributions and as alternative the class of distributions given by

(4.7)  $P\{Z \in A\} = \int_A C \exp\Big\{\sum_{i=1}^{n} \big[\theta_1 x_i + \theta_2 y_i + r(x_i, y_i)\big]\Big\}\, d\mu(z).$

The $\theta$'s here are any real numbers, $\mu$ is the $2n$th power of any one dimensional measure $\nu$, and $r$ is any $\nu$-measurable function such that (a) the integral (4.7) converges when $A$ is the whole space, and such that (b) $r(x, y) = r(y, x)$. Clearly in the one-sided case $\theta_2 > \theta_1$, we will again find $g \sim \sum y_i - \sum x_i$, so that the associated test is uniformly most powerful against this one sided class of alternatives.

,X“;r£;rKi7i:" "ir"? ■ •"r", 

Ecnerahsed densities given by ’ 


(i8) 




transformations -^nd «nder iho Kroup Iry ,l,e 





tribution equals $\tfrac12$. The test of $H_0$ against the alternatives that $Z_1, \dots, Z_n$ is a sample of a characteristic variable is based on $\sum z_i$ or $|\sum z_i - n/2|$ as $P\{Z_i = 1\}$ is restricted to be greater than $\tfrac12$ or is not so restricted. In the first case the test is most powerful, in the second most stringent.

5. Hypotheses of invariance for independent variables. To the results obtained so far, a different interpretation can be given, which throws some light on certain related problems. Theorem 2 gave sufficient conditions for a test to be most powerful against a simple alternative $H_1$ for the hypothesis $H_0$ of invariance under a partition $\Pi$. However, if taken in conjunction with section 1, the theorem can be interpreted as giving sufficient conditions for a test to be the most powerful test of structure $S(\epsilon)$ with respect to $\Pi$ against $H_1$. That is, the theorem is really independent of the hypothesis, and depends solely on the alternative and on the class of tests admitted into competition, in our case the class of all tests having structure $S(\epsilon)$ with respect to $\Pi$. The same remark obviously also applies to most stringent tests.

Let us now consider a special class of partitions. Let $Z$ stand for the $m$ groups of random variables $(Z_{i1}, \dots, Z_{is_i})$ $(i = 1, \dots, m)$ and let $\Pi$ denote the partition under which two points $z$ and $z'$ are equivalent provided they can be obtained from each other by a permutation of coordinates which however permutes only the coordinates within the $m$ groups. Let $\mu$ be the power of a one dimensional measure $\nu$, and assume that the probability distribution of $Z$ is absolutely continuous with respect to $\mu$ and that the $Z$'s are independently distributed, so that

(5.1)  $dF(z) = \prod_{i=1}^{m} \prod_{j=1}^{s_i} f_{ij}(z_{ij})\, d\mu(z).$

Under these assumptions consider the hypothesis $H$ that $f_{ij}$ is independent of $j$, that is, that the $Z$'s are identically distributed within each group. It easily can be shown that not all admissible tests of $H$ that have size $\epsilon$ have structure $S(\epsilon)$. However a generalization of a result of Feller [10] and Scheffé [5] for the case $m = 1$ and $\mu$ = Lebesgue measure states that the only tests which are of size $\epsilon$ and similar for $H$ are the tests of structure $S(\epsilon)$ with respect to $\Pi$ [11]. It follows that any test which is most powerful or most stringent for testing the hypothesis $H'$ of invariance under $\Pi$ for the class of generalized densities with respect to $\mu$ has the same property relative to the class of all tests which are similar for testing $H$.

As an example, take problem 1b) of section 3. Here $\mu$ is Lebesgue measure, $m$ is 1, and we put

$Z_j = \begin{cases} U_j & \text{for } j = 1, \dots, k, \\ V_{j-k} & \text{for } j = k + 1, \dots, k + l = s. \end{cases}$

It was shown in section 3 that the test based on $|\bar u - \bar v|$, Pitman's test, is most stringent for testing the hypothesis that the joint density of the $U$'s and $V$'s is




symmetric in its $k + l$ arguments against the alternative that the $U$'s and $V$'s are independently normally distributed with common variance and with $E(U_i) = \xi$, $E(V_i) = \eta$ where $\xi$ and $\eta$ are any distinct real numbers. It follows now that the same test is most stringent similar for testing against the same class of alternatives the hypothesis that $U_1, \dots, U_k, V_1, \dots, V_l$ are independently distributed, all with the same probability density. This is the problem for which Pitman proposed his test, and the result just stated is a partial solution of the problem recently raised by Wilks [12], to determine the class of alternatives for which Pitman's test is satisfactory.
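
For small samples Pitman's two sample test can be carried out by enumerating all divisions of the pooled observations into groups of sizes $k$ and $l$. The Python sketch below returns the proportion of divisions whose criterion $|\bar v - \bar u|$ is at least as large as the observed one. The function name `pitman_two_sample_p_value` is hypothetical, integer or rational data are assumed so that comparisons are exact, and a p-value is reported in place of the randomized critical function of (2.1).

```python
from fractions import Fraction
from itertools import combinations


def pitman_two_sample_p_value(u, v):
    """Exact permutation p-value for the criterion |v_bar - u_bar|."""
    pooled = list(u) + list(v)
    k, l = len(u), len(v)
    total = sum(pooled)

    def criterion(sum_u):
        # |v_bar - u_bar| for a division whose first group sums to sum_u.
        return abs(Fraction(total - sum_u) / l - Fraction(sum_u) / k)

    observed = criterion(sum(u))
    divisions = hits = 0
    for group in combinations(pooled, k):    # every choice of k observations as the u's
        divisions += 1
        if criterion(sum(group)) >= observed:
            hits += 1
    return Fraction(hits, divisions)


print(pitman_two_sample_p_value([1, 2, 3], [4, 5, 6]))
```

Each division corresponds to the same number $k!\,l!$ of permutations of the pooled sample, so counting divisions gives the same proportion as counting permutations.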

If we modify the example by taking for $\mu$ instead of Lebesgue measure the $(k + l)$th power of the measure $\nu^*$ of section 4, we are dealing with characteristic variables $U_1, \dots, U_k, V_1, \dots, V_l$. We have shown earlier that if $k = l$ the test based on $|\bar u - \bar v|$ is most stringent for testing the hypothesis of complete permutability against the alternative that the $U$'s and $V$'s are samples from two distinct populations of characteristic variables. If we add to this hypothesis the assumption of independence of all variables, we obtain a parametric problem, namely essentially the problem of testing equality of probability of success in two binomial populations corresponding to the same number of trials. It now follows that the test based on $|\bar u - \bar v|$ is most stringent similar for this problem. As is well known, it is also the uniformly most powerful unbiased similar test.

These two examples suffice to illustrate the type of result that can be obtained. It should perhaps be mentioned that the equivalence discussed at the beginning of this section can be utilized also in the opposite direction. The fact, for example, that the test based on $|\bar u - \bar v|$ is known to be uniformly most powerful unbiased similar for testing equality of probability of success in two populations of characteristic variables from which the $U$'s and $V$'s are samples, proves that this test is uniformly most powerful unbiased for testing the hypothesis of complete symmetry for the joint generalized density of the $U$'s and $V$'s.

nftv ^ equivalents elaaass. Tim dcfiniuon pf n livt»iillii>i« 

parametric theory ^ ^ "liMtdnr.l 

2 mt f"'.i 

foreachleTletSibeanianmn hi* M Moiva; 7^t .Tim «i,nii siinta, nnd 

the IS, are J^iii etcll™:! “ha ?" 

which can be expressed in the foim ^ «'««» “f »« C, t tf 

( 61 ) 





and let $\mathfrak{D}$ be the class of all $D$ occurring in such relationships. For each $t \in \mathfrak{T}$ let $G_t$ be a specified probability measure over $\mathfrak{A}_t$, where $\mathfrak{A}_t$ is the class of $A_t$ such that $A_t \subset S_t$, $A_t \in \mathfrak{A}$. Let $Z$ be a random variable distributed over the sample space according to an unknown probability measure $F$. Let $\psi(z)$ be that $t \in \mathfrak{T}$ for which $z \in S_t$, and let $T = \psi(Z)$. Let $H_0$ be the hypothesis that for each $t \in \mathfrak{T}$ the conditional distribution of $Z$ given $Z \in S_t$ is $G_t$, i.e. that there exists a probability measure $Q$ over $\mathfrak{D}$ such that for all $A \in \mathfrak{A}$

(6.2)  $P(A) = \int G_t(A \cap S_t)\, dQ(t).$

It is seen that we have essentially the situation described in section 1, except that there we assumed further that each $S_t$ was finite and for all $t$, $G_t$ assigned equal probabilities to all points of $S_t$.

We say that a test $\varphi$ of $H_0$ has structure $S(\epsilon)$ if the conditional expectation of $\varphi$ given $Z \in S_t$ satisfies

(6.3)  $E[\varphi(Z) \mid Z \in S_t] = \int_{S_t} \varphi\, dG_t = \epsilon$ for all $t$.

The lemmas and theorems stated below are straightforward generalizations of those in section 1 so that no proof will be given.

Lemma 1'. A test $\varphi$ of structure $S(\epsilon)$ with respect to $H_0$ is similar and of size $\epsilon$ for $H_0$.

Lemma 2'. If $\varphi$ is any test of $H_0$ of size $\le \epsilon$, there exists a test $\varphi_1$ of $H_0$ having structure $S(\epsilon)$ and such that

(6.4)  $\int \varphi_1\, dF \ge \int \varphi\, dF$

for all probability measures $F$ for which the conditional distribution of $Z$ given $Z \in S_t$ is absolutely continuous with respect to $G_t$ for all $t$.
Suppose next there is defined another partition of the sample space into sets $\{S_u'\}$ by means of a space $\mathfrak{U}$, and let $T'$, $\psi'$ and $\mathfrak{A}_u'$ refer to this second partition. We shall assume that for every $t \in \mathfrak{T}$, $u \in \mathfrak{U}$ either $S_u' \subset S_t$ or $S_t \cap S_u'$ is empty. Let $G_u'$ be a specified probability measure over $\mathfrak{A}_u'$ and suppose that for each $t \in \mathfrak{T}$ there exists a probability measure $Q_t$ such that for all $A_t \in \mathfrak{A}_t$

(6.5)  $G_t(A_t) = \int G_u'(A_t \cap S_u')\, dQ_t(u).$

If $H_1$ denotes the hypothesis that for each $u \in \mathfrak{U}$ the conditional distribution of $Z$ given $Z \in S_u'$ is $G_u'$, we can state

Theorem 1'. For testing $H_0$ against $H_1$ at level of significance $\epsilon$, the totality of tests $\varphi$ which have structure $S(\epsilon)$ and for which $z \in S_u'$, $z' \in S_u'$ implies $\varphi(z) = \varphi(z')$ form an essentially complete class of admissible tests.
Let $F_1$ be a distribution not in $H_0$, and for each $t \in \mathfrak{T}$ let $G_{1t}$ be the conditional distribution of $Z$ given $Z \in S_t$. We suppose that for each $t \in \mathfrak{T}$, $G_{1t}$ is chosen to be a true probability measure, which is possible in most cases of practical





interest (see Doob [13] for a discussion of this point). We have the equivalent of theorem 2:

Theorem 2'. Let

(6.6)  $G_{1t}(A_t) = \int_{A_t} g_t\, dG_t + G_{1t}(A_t \cap B_t)$

for all $A_t \subset S_t$, where in accordance with the Radon-Nikodym Theorem [14], $g_t$ is a non-negative function integrable over $S_t$, and $B_t \subset S_t$ has $G_t$ measure 0 and does not depend on $A_t$. For testing $H_0$ against $F_1$, a most powerful test of size $\epsilon$ is given by $\varphi(z) = \varphi_t(z)$ for $z \in S_t$ where

(6.7)  $\varphi_t(z) = \begin{cases} 1 & \text{if } z \in B_t, \\ 1 & \text{if } g_t(z) > c_t, \\ a_t & \text{if } g_t(z) = c_t, \\ 0 & \text{if } g_t(z) < c_t, \end{cases}$

where $a_t$ and $c_t$ are so chosen that $\varphi$ has structure $S(\epsilon)$.

Theorems 3 and 4 require no modification.

As in the case of finite equivalence classes the results just obtained can be interpreted differently. Again the theorems are really independent of the hypotheses, but depend only on the alternatives and on the class of tests admitted into competition. This class of tests $\varphi$ is in the present case defined by condition (6.3), that the conditional expectation of $\varphi$ given $Z \in S_t$ equals $\epsilon$. But this is just the condition which in the standard approach to the problem of testing a composite parametric hypothesis for which $T$ is a sufficient statistic, by means of similar regions, is frequently found to be the necessary and sufficient condition for $\varphi$ to be similar. (See for example [15].) For these cases therefore the hypotheses of the present section represent non-parametric analogues, for which the tests possess the same optimum properties but without the a priori restriction to similar regions.

As a simple illustration of this remark, let $Z = (Z_1, \dots, Z_n)$, and $T = \sum Z_i^2$. For the conditional distribution of $Z$ given $T = t$ take the uniform distribution over the sphere $\sum z_i^2 = t$. Then the hypothesis $H$ states merely that the joint probability density of the $Z$'s is a function only of $\sum z_i^2$. If we add to this the assumption of independence of the $Z$'s, we obtain the new hypothesis $H'$ that the $Z$'s are a sample from a normal distribution with zero mean. The tests for which the conditional expectation over each sphere is $\epsilon$ are the only admissible tests of $H$ and the only admissible similar tests of $H'$. If $H_1$ denotes the alternative that the $Z_i$ are a sample from a normal distribution with positive mean, the test based on

(6.8)  $\dfrac{\sum z_i}{\sqrt{\sum z_i^2}}$





is uniformly most powerful for $H$ and uniformly most powerful similar for $H'$. If we do not restrict the mean to positive values, the test based on the absolute value of (6.8), Student's test, is uniformly most powerful unbiased and most stringent for testing $H$, most powerful unbiased similar and most stringent similar for testing $H'$.


REFERENCES

[1] R. A. Fisher, Design of Experiments, Oliver and Boyd, Edinburgh, 1935.
[2] J. Neyman, K. Iwaszkiewicz and St. Kolodziejczyk, "Statistical problems in agricultural experimentation," Roy. Stat. Soc. Jour., Suppl., Vol. 2 (1935), p. 107.
[3] E. J. G. Pitman, "Significance tests which may be applied to samples from any population," Roy. Stat. Soc. Jour., Suppl., Vol. 4 (1937), p. 119; "II. The correlation coefficient test," Roy. Stat. Soc. Jour., Suppl., Vol. 4 (1937), p. 225; "III. The analysis of variance test," Biometrika, Vol. 29 (1938), p. 322.
[4] E. L. Lehmann and C. Stein, "Most powerful tests of composite hypotheses. I. Normal distributions," Annals of Math. Stat., Vol. 19 (1948).
[5] H. Scheffé, "On a measure problem arising in the theory of non-parametric tests," Annals of Math. Stat., Vol. 14 (1943), p. 227.
[6] A. Wald, "An essentially complete class of admissible decision functions," Annals of Math. Stat., Vol. 18 (1947), p. 549.
[7] G. Hunt and C. Stein, "Most stringent tests of statistical hypotheses," unpublished.
[8] A. Wald and J. Wolfowitz, "An exact test for randomness in the non-parametric case, based on serial correlation," Annals of Math. Stat., Vol. 14 (1943), p. 378.
[9] M. C. K. Tweedie, "Functions of a statistical variate with given means, with special reference to Laplacian distributions," Camb. Phil. Soc. Proc., Vol. 43 (1947).
[10] W. Feller, "Note on regions similar to the sample space," Stat. Res. Memoirs, Vol. 2 (1938), p. 117.
[11] E. L. Lehmann and H. Scheffé, "Completeness, similar regions and unbiased estimation," unpublished.
[12] S. S. Wilks, "Order statistics," Am. Math. Soc. Bull., Vol. 54 (1948), p. 6.
[13] J. L. Doob, "Asymptotic properties of Markoff transition probabilities," Trans. Am. Math. Soc., Vol. 63 (1948), footnote p. 399.
[14] S. Saks, Theory of the Integral, Stechert, 1937.
[15] J. Neyman and E. S. Pearson, "On the problem of the most efficient tests of statistical hypotheses," Roy. Soc. Phil. Trans., Ser. A, Vol. 231 (1933), p. 289.
[16] A. Wald, On the Principles of Statistical Inference, Notre Dame Mathematical Lectures, Number 1.



ESTIMATION OF THE PARAMETERS OF A SINGLE EQUATION IN A
COMPLETE SYSTEM OF STOCHASTIC EQUATIONS

By T. W. Anderson and Herman Rubin


Columbia University and Institute for Advanced Study


1. Summary. A method is given for estimating the coefficients of a single
equation in a complete system of linear stochastic equations (see expression
(2.1)), provided that a number of the coefficients of the selected equation are
known to be zero. Under the assumption of the knowledge of all variables in
the system and the assumption that the disturbances in the equations of the
system are normally distributed, point estimates are derived from the regressions
of the jointly dependent variables on the predetermined variables (Theorem 1).
The vector of the estimates of the coefficients of the jointly dependent variables
is the characteristic vector of a matrix involving the regression coefficients
and the estimate of the covariance matrix of the residuals from the regression
functions. The vector corresponding to the smallest characteristic root is
taken. An efficient method of computing these estimates is given in section 7.
The asymptotic theory of these estimates is given in a following paper [2].

When the predetermined variables can be considered as fixed, confidence
regions for the coefficients can be obtained on the basis of small sample theory
(Theorem 3).

A statistical test for the hypothesis of over-identification of the single equation
can be based on the characteristic root associated with the vector of point
estimates (Theorem 2) or on the expression for the small sample confidence
region (Theorem 4). This hypothesis is equivalent to the hypothesis that the
coefficients assumed to be zero actually are zero. The asymptotic distribution
of the criterion is shown in a following paper [2] to be that of $\chi^2$.


2. A complete system of linear difference equations. In many fields of study
such as economics, biology, and meteorology the occurrence of values of the
observed quantities can be described in terms of a probability model which, as a
first approximation, is a set of stochastic equations. Consider a (row) vector $y_t$
of quantities which are observed at time $t$. Suppose that these quantities are
jointly dependent on a vector $z_t$ of quantities "predetermined" at time $t$ (i.e.,
known without error at time $t$). Some of the coordinates of $z_t$ may be coordinates


' ThiB paper will be Included m Cowles ComralBsioti Papers, Npw fkirips, Xo 

wWr proBontod at moollnKS of llie IiibIHuIr «f \hlh^m«Hral 

Cli«l.ier) >..d In lil.nr., N 1 . 

BenSXo^oZr’' ot .1,, Conic. Ooin.ntein., for rt«. 



ESTIMATION OF PARAMETERS


of $y_{t-1}, y_{t-2}, \ldots$; the other coordinates of $z_t$ are quantities which are assumed given.
The coordinates of $y_t$ ($t = 1, 2, \ldots, T$) are called endogenous. The
part of $z_t$ which does not consist of endogenous variables is called
exogenous; these are treated as "fixed variates." For convenience we shall
think of $t$ as indicating a point of time, although it may in many cases indicate
the ordering of a sequence in another dimension, or, indeed, the $t$ may indicate
simply a numbering of the observations (if $z_t$ is entirely exogenous). In a
dynamic economic model the endogenous variables are economic quantities
such as amount of investment, interest rate, amount of consumption, etc. The
exogenous variables are those quantities which are considered to be determined
primarily outside the economic system, such as amount of rainfall, amount of
government expenditures, time, etc.

A simple probability model may be set up on the assumption that these
quantities approximately satisfy certain linear equations. Specifically the model
is

(2.1)    $B_{yy} y_t' + \Gamma_{yz} z_t' = \epsilon_t'$,

where $\epsilon_t$ is a (row) vector having a probability distribution with expected value
zero and $B_{yy}$ and $\Gamma_{yz}$ are matrices, the former being non-singular. Primes ($'$)
indicate transposition of vectors and matrices. If there are $G$ jointly dependent
variables, there are $G$ component equations in (2.1); that is, there are as many
equations as there are variables depending on the system. The fact that $y_t$
and $z_t$ do not satisfy linear equations exactly is indicated by setting the linear
forms not equal to zero, but equal to random elements, called disturbances.
We will call the component equations of (2.1) structural equations, for they
express the structure of the system. For example, one equation involving the
amount of goods consumed, the prices of these goods, the size of the national
income, etc., might describe the behaviour of the consumers. Another equation
involving interest rate might relate to the behaviour of investors.
It has been shown [7], [11], that in general one cannot use ordinary regression
methods to estimate the matrices $B_{yy}$ and $\Gamma_{yz}$ and the parameters of an assumed
distribution of the disturbances. Mann and Wald [9], for a special class of
systems, and Koopmans, Rubin, and Leipnik [11], in a more general case, have
obtained maximum likelihood estimates of all of the parameters for the case of
the $\epsilon_t$ having a normal multivariate distribution.

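Since $B_{yy}$ is non-singular, we can rewrite (2.1) in a different form, called the
reduced form,

(2.2)    $y_t' = -B_{yy}^{-1}\Gamma_{yz} z_t' + B_{yy}^{-1}\epsilon_t'$

or

(2.3)    $y_t' = \Pi_{yz} z_t' + \eta_t'$,

where

(2.4)    $\Pi_{yz} = -B_{yy}^{-1}\Gamma_{yz}$,

(2.5)    $\eta_t' = B_{yy}^{-1}\epsilon_t'$.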



T. W. ANDERSON AND HERMAN RUBIN


If $\epsilon_t$ has a normal distribution, so does $\eta_t$. For a given $z_t$, we can consider
the model as specifying a distribution of $y_t$ with conditional mean $\Pi_{yz} z_t'$.

It is clear that we can multiply (2.1) on the left by any non-singular matrix
and obtain a system of equations which defines the same distribution of $y_t$.
On the other hand, it has been shown that the only transformations of (2.1)
which preserve the linearity of the system of equations are multiplications on the
left by non-singular matrices. If there are a priori restrictions on $B_{yy}$ and $\Gamma_{yz}$,
the set of matrices which result in new coefficient matrices satisfying these
restrictions is correspondingly decreased. If the set of admissible matrix
multipliers includes only diagonal matrices the system of structural equations
is said to be identified. In this case only multiplication of each equation by a
given constant is permitted.
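The relation between the structural form (2.1) and the reduced form (2.3) can be checked numerically. What follows is a minimal sketch in Python with NumPy, assuming a hypothetical system with two jointly dependent and three predetermined variables; the numerical matrices and the helper name `reduced_form` are illustrative and are not taken from the paper.

```python
import numpy as np

def reduced_form(B_yy, Gamma_yz):
    """Return Pi_yz = -B_yy^{-1} Gamma_yz, as in (2.4)."""
    return -np.linalg.solve(B_yy, Gamma_yz)

# Hypothetical structural coefficients (illustrative numbers only).
B_yy = np.array([[1.0, -0.5],
                 [0.3,  1.0]])
Gamma_yz = np.array([[-1.0,  0.0, -2.0],
                     [ 0.0, -1.5,  0.0]])
Pi_yz = reduced_form(B_yy, Gamma_yz)

# Multiplying the structure on the left by a non-singular matrix A
# gives a different structural form with the same reduced form.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
Pi_again = reduced_form(A @ B_yy, A @ Gamma_yz)
```

The second computation illustrates why restrictions are needed: without them, the reduced form alone cannot distinguish $(B_{yy}, \Gamma_{yz})$ from $(AB_{yy}, A\Gamma_{yz})$.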

Knowledge of the distribution of $y_t$ given $z_t$ is obviously equivalent to knowledge
of $\Pi_{yz}$ in (2.3) and the distribution of $\eta_t$. When the system is identified,
the matrix $B_{yy}$ and

(2.6)    $\Gamma_{yz} = -B_{yy}\Pi_{yz}$

are determined uniquely except for multiplication on the left by a diagonal
matrix. Thus identification of a system is equivalent to the possibility of
inferring the structural equations from knowledge of the distribution. The
estimation of all coefficients of $B_{yy}$ and $\Gamma_{yz}$ has been considered in [11].

3. A single identified equation of a complete system. In many instances the
investigator may be interested only in a specific equation of the system, say,

(3.1)    $\beta_y y_t' + \gamma_z z_t' = \zeta_t$,

where $\zeta_t$ is a scalar disturbance. The investigator may not be interested in the
entire system (2.1) of which (3.1) is one component. Since a considerable
amount of computation is necessary to estimate all parameters of a complete
system, there arises the problem of estimating only the coefficients of a single
equation. It is desirable to do this with the least possible restrictive assumptions
about the part of the system which is not the selected structural equation. In
order to treat the selected equation at all, we require that it is identified, that is,
that there are certain restrictions on $(\beta_y, \gamma_z)$ such that no linear combination of
rows of $(B_{yy}, \Gamma_{yz})$ satisfies these restrictions other than a constant times $(\beta_y, \gamma_z)$.
It is not necessary to assume that every component equation is identified, that is,
that the entire system is identified.

We shall suppose that the restrictions imposed are that certain coefficients
are zero. We can arrange the components of the vectors so that the restrictions
are

(3.2)    $(\beta_y, \gamma_z) = (\beta, 0, \gamma, 0)$,

where

(3.3)    $\beta = (\beta^1, \ldots, \beta^H)$

has $H$ coefficients not assumed to be zero and

(3.4)    $\gamma = (\gamma^1, \ldots, \gamma^F)$

has $F$ coefficients not assumed to be zero.
It will be convenient to divide the $G$ components of $y_t$ into two groups (in
number $H$ and $G - H$, respectively), and the $K$ components of $z_t$ into two groups
(in number $F$ and $D$ respectively) according to whether or not the components
enter into (3.1) with coefficients not assumed to be zero. Let

(3.5)    $y_t = (x_t \;\; r_t)$,

(3.6)    $z_t = (u_t \;\; v_t)$,

where

(3.7)    $x_t = (x_{t1}, \ldots, x_{tH})$,

(3.8)    $r_t = (r_{t1}, \ldots, r_{t,G-H})$,

(3.9)    $u_t = (u_{t1}, \ldots, u_{tF})$,

(3.10)    $v_t = (v_{t1}, \ldots, v_{tD})$.

Then the selected equation is

(3.11)    $\beta x_t' + \gamma u_t' = \zeta_t$.

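Now let us see how the identification is accomplished. Partitioning $\Pi_{yz}$ into
$H$ and $G - H$ rows and $F$ and $D$ columns as

$\Pi_{yz} = \begin{pmatrix} \Pi_{xu} & \Pi_{xv} \\ \Pi_{ru} & \Pi_{rv} \end{pmatrix}$,

we can write the reduced form (2.3) as

(3.12)    $x_t' = \Pi_{xu} u_t' + \Pi_{xv} v_t' + \delta_t'$,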

(3 13^ r'l ^ rirtful + Ilrhiii + t/i 

whore 

^1 “ (5i r fi)' 

Multiplying the above equations by $(\beta, 0)$ we obtain

(3.14)    $\beta x_t' = \beta\Pi_{xu} u_t' + \beta\Pi_{xv} v_t' + \beta\delta_t'$.

Since this must be identical to (3.11) we must have

(3.15)    $\gamma = -\beta\Pi_{xu}$,

(3.16)    $0 = \beta\Pi_{xv}$.

The matrices $\Pi_{xu}$ and $\Pi_{xv}$ are defined by the distribution of $x_t$ given $u_t$ and $v_t$
(for at least $K = F + D$ linearly independent values of $(u_t, v_t)$). The equation
(3.11) is identified if and only if the solution of (3.15) and (3.16) for $\beta$ and $\gamma$ is
unique except for a constant of proportionality. This requires the rank of
$\Pi_{xv}$ being $H - 1$. Thus a necessary and sufficient condition that (3.11) is
identified is that the rank of $\Pi_{xv}$ be $H - 1$. In particular this implies that
the number of coordinates of $v_t$ (the number of zero coefficients in $\gamma_z$) be at
least $H - 1$. It can easily be shown that this condition is equivalent to requiring
that the rank of the matrix obtained by selecting the $G - H$ columns of $B_{yy}$
and the $D$ columns of $\Gamma_{yz}$ corresponding to the coefficients assumed zero in the
selected equation is $G - 1$. This is the condition given by Koopmans and
Rubin [11]. Other homogeneous linear restrictions can be put in this form.
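The rank condition, and the recovery of $\beta$ and $\gamma$ from the reduced-form matrices through (3.15) and (3.16), can be illustrated on a small numerical case. This is a sketch in Python with NumPy under stated assumptions: the matrices are hypothetical, the function names are illustrative, and the normalization (first coefficient of $\beta$ equal to 1) is one convenient choice that requires that coefficient to be non-zero.

```python
import numpy as np

def is_identified(Pi_xv, tol=1e-10):
    """Rank condition: Pi_xv (H x D) must have rank H - 1."""
    H = Pi_xv.shape[0]
    return np.linalg.matrix_rank(Pi_xv, tol=tol) == H - 1

def structural_from_reduced(Pi_xu, Pi_xv):
    """Solve beta Pi_xv = 0 and gamma = -beta Pi_xu, with beta[0] = 1."""
    U, _, _ = np.linalg.svd(Pi_xv)   # full U is H x H
    beta = U[:, -1]                  # spans the left null space when the rank is H - 1
    beta = beta / beta[0]            # assumes the first coefficient is non-zero
    gamma = -beta @ Pi_xu            # (3.15)
    return beta, gamma

# Hypothetical reduced-form blocks with H = 2, D = 2, F = 1.
Pi_xv = np.array([[2.0, 4.0],
                  [1.0, 2.0]])       # rank 1 = H - 1
Pi_xu = np.array([[1.0],
                  [3.0]])
identified = is_identified(Pi_xv)
beta, gamma = structural_from_reduced(Pi_xu, Pi_xv)
```

With sample regression matrices in place of the population ones, $P_{xv}$ will usually have full rank when $D \geq H$, which is why an exact solution of this kind is not available in the over-identified case.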

If the vector $\epsilon_t$ is normally distributed with mean zero the vector $\delta_t$ is normally
distributed with mean zero. Let the covariance matrix of $\delta_t$ be $\Omega_{xx}$. Then
the variance of $\zeta_t$ is

(3.17)    $\sigma^2 = \beta\Omega_{xx}\beta'$.

The constant of proportionality in $\beta$ may be determined by setting the variance
of $\zeta_t$, $\sigma^2 = 1$; another normalization is

(3.18)    $\beta^i = 1$,

where $\beta^i$ is the $i$th coordinate of $\beta$. In general the normalization can be written as

(3.19)    $\beta\Phi_{xx}\beta' = 1$,

where $\Phi_{xx}$ can be either a known constant or can be a known function of unknown
parameters.

As an estimation procedure for $\beta$ and $\gamma$ when $D = H - 1$, M. A. Girshick
suggested in an unpublished note that one solve equations (3.15) and (3.16)
with $(\Pi_{xu}, \Pi_{xv})$ replaced by the sample regression of $x$ on $u$ and $v$.
By this means Girshick found confidence regions (see section 8) for the
coefficients.

The present paper develops a method for handling the case of $D \geq H$. In
this case the rank of $P_{xv}$ is usually $H$, thus giving no admissible estimate of $\beta$.
The proposed method follows the approach used in discriminant problems.

In a second paper [2] the present authors shall give asymptotic properties of
these estimates that give a certain justification for the use of them. Under
very general assumptions concerning the $z_t$ and the $\epsilon_t$ we prove that these
estimates are consistent. These hypotheses permit the investigator to do
without assuming that the disturbances are normally
distributed, or even that they have identical distributions.


nf the vfriX" 





|)\ iiijitnv flljsu IT,,) fri coonTicicnh of xt on nt and y, Tho 

interdependence of the coordinates of $x_t$ indicated by the selected equation
nullifies the dependence on $v_t$; that is,

(4.1)    $\beta\Pi_{xv} = 0$.

Suppose we wish to estimate $\beta$ and $\gamma$ from a sample of $T$ observations:
$(x_1, z_1), (x_2, z_2), \ldots, (x_T, z_T)$. The information we need can be summarized
in the second order moment matrices

(4.2)    $M_{xx} = \frac{1}{T}\sum_{t=1}^{T} x_t' x_t$,

(4.3)    $M_{xz} = \frac{1}{T}\sum_{t=1}^{T} x_t' z_t$,

(4.4)    $M_{zz} = \frac{1}{T}\sum_{t=1}^{T} z_t' z_t$.

Since one coordinate of $u_t$ may be unity there is no advantage in taking these
moments about the mean. We shall find it more convenient to use instead of
$v_t$ the part of $v_t$ that is orthogonal to $u_t$; that is, we shall use

(4.5)    $s_t' = v_t' - M_{vu}M_{uu}^{-1}u_t'$.

The moments are then $M_{xx}$, $M_{xu}$, $M_{uu}$,

(4.6)    $M_{xs} = M_{xv} - M_{xu}M_{uu}^{-1}M_{uv}$,

and

(4.7)    $M_{ss} = M_{vv} - M_{vu}M_{uu}^{-1}M_{uv}$.

We can express the reduced form as

(4.8)    $x_t' = \Pi^*_{xu}u_t' + \Pi_{xs}s_t' + \delta_t'$,

where

(4.9)    $\Pi^*_{xu} = \Pi_{xu} + \Pi_{xv}M_{vu}M_{uu}^{-1}$,    $\Pi_{xs} = \Pi_{xv}$.

An estimate of $\Pi_{xs}$ is the regression of $x$ on $s$,

(4.10)    $P_{xs} = M_{xs}M_{ss}^{-1}$.

To estimate $\beta$ we take the $\beta$ that makes $\beta P_{xs}$ smallest in the metric determined
by the moment matrix of the residuals

(4.11)    $W_{xx} = M_{xx} - P^*_{xu}M_{uu}P^{*\prime}_{xu} - P_{xs}M_{ss}P_{xs}'$,

where

(4.12)    $P^*_{xu} = M_{xu}M_{uu}^{-1}$.

This is the natural generalization of least squares; the estimate corresponds
to the component with least variance. This estimate is the vector satisfying

(4.13)    $(P_{xs}M_{ss}P_{xs}' - \nu W_{xx})b' = 0$

which is associated with the smallest root of

(4.14)    $|P_{xs}M_{ss}P_{xs}' - \nu W_{xx}| = 0$.

This is normalized and the estimate of $\gamma$ is $-bP^*_{xu}$.

In section 5 we derive these estimates by the method of maximum likelihood
under certain assumptions. Although it is assumed that the disturbances are
normally distributed for this derivation, the estimates can be used in more
general situations. This theory is in one sense a special case of the theory of
estimating a matrix of means of a given dimensionality which is an extension
of the discriminant function theory [5]. For an application of this method of
estimation see [6].
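The estimation steps of this section can be sketched numerically. The following Python/NumPy code is an illustration under stated assumptions, not the authors' computing scheme (that is given in section 7): observations are stored as rows, the determinantal equation is solved through a Cholesky reduction to a symmetric eigenproblem (a standard device chosen here for convenience), the normalization sets the first coefficient of $b$ to 1, and the simulated data and the name `liml_single_equation` are hypothetical.

```python
import numpy as np

def liml_single_equation(X, U, V):
    """X: T x H dependent variables in the equation; U: T x F included
    predetermined variables; V: T x D excluded predetermined variables."""
    T = X.shape[0]
    M_uu = U.T @ U / T
    M_xu = X.T @ U / T
    P_star_xu = np.linalg.solve(M_uu, M_xu.T).T          # M_xu M_uu^{-1}
    S = V - U @ np.linalg.solve(M_uu, U.T @ V / T)       # part of v orthogonal to u
    M_xs = X.T @ S / T
    M_ss = S.T @ S / T
    A = M_xs @ np.linalg.solve(M_ss, M_xs.T)             # P_xs M_ss P_xs'
    W = X.T @ X / T - P_star_xu @ M_uu @ P_star_xu.T - A # residual moment matrix
    # Reduce |A - nu W| = 0 to a symmetric eigenproblem using W = L L'.
    L_inv = np.linalg.inv(np.linalg.cholesky(W))
    roots, vecs = np.linalg.eigh(L_inv @ A @ L_inv.T)    # roots in ascending order
    nu = roots[0]                                        # smallest root
    b = L_inv.T @ vecs[:, 0]
    b = b / b[0]                                         # first coefficient set to 1
    c = -b @ P_star_xu                                   # estimate of gamma
    sigma2 = (1.0 + nu) * (b @ W @ b)                    # variance estimate for this scaling
    return b, c, nu, sigma2, A, W

# Hypothetical data: x1 - 0.5 x2 - 1 = zeta, with u_t = 1 and two excluded variables.
rng = np.random.default_rng(0)
T = 4000
U = np.ones((T, 1))
V = rng.normal(size=(T, 2))
e2 = rng.normal(size=T)
zeta = 0.5 * e2 + rng.normal(size=T)      # disturbance correlated with x2
x2 = 1.0 + V[:, 0] + V[:, 1] + e2
x1 = 0.5 * x2 + 1.0 + zeta
X = np.column_stack([x1, x2])
b, c, nu, sigma2, A, W = liml_single_equation(X, U, V)
```

In this simulated case the true values are $\beta = (1, -0.5)$ and $\gamma = (-1)$, so `b` and `c` should be close to those values for a sample of this size.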


5. Derivation of maximum likelihood estimates. We derive the estimates of
$\beta$, $\gamma$, and $\sigma^2$ under the following assumptions.

ASSUMPTION A. The selected structural equation

(3.11)    $\beta x_t' + \gamma u_t' = \zeta_t$

is one equation of a complete linear system of $G$ stochastic equations. The equation
is identified by the fact that if $H$ is the number of coordinates in $x_t$ there are at least
$H - 1$ coordinates in $v_t$, the vector of predetermined variables not in (3.11) but
in the system.

ASSUMPTION B. At time $t$ all of the coordinates of $z_t = (u_t, v_t)$ are given.
ASSUMPTION C. The coordinates of $z_t$ are given functions of exogenous variables
and of coordinates of $y_{t-1}, y_{t-2}, \ldots$. If coordinates of $y_0, y_{-1}, \ldots$ are involved
in $z_t$, they will be considered as given numbers. The moment matrix $M_{zz}$ is
non-singular with probability one.

ASSUMPTION D. The disturbance vectors $\delta_t$ are distributed serially independently
and normally with mean zero and covariance matrix $\Omega_{xx}$.

We shall consider normalizations (3.19) where $\Phi_{xx}$ may be a function of other
parameters, but

(51) Oi 

We can state the results in a theorem.

THEOREM 1. Under assumptions A, B, C, and D the maximum likelihood
estimate of $\beta$ is

(5.2)    $\hat\beta = b/\sqrt{b\hat\Phi_{xx}b'}$,

where $b$ is the solution of

(4.13)    $(P_{xs}M_{ss}P_{xs}' - \nu W_{xx})b' = 0$

corresponding to the smallest value of $\nu$ and $P_{xs}$ is defined by (4.10), $M_{ss}$ by (4.7),
and $W_{xx}$ by (4.11). An estimate of $\gamma$ based on the maximum likelihood estimate
of $\Pi^*_{xu}$ is given by

(5.3)    $\hat\gamma = -\hat\beta P^*_{xu}$,

where $P^*_{xu}$ is defined by (4.12). The estimate of $\sigma^2$ is

(5.4)    $\hat\sigma^2 = (1 + \nu)/b\hat\Phi_{xx}b'$

if

(5.5)    $bW_{xx}b' = 1$.

We apply the method of maximum likelihood to

(5.6)    $L = (2\pi)^{-HT/2}\,|\Omega_{xx}|^{-T/2}\exp\left\{-\frac{1}{2}\sum_{t=1}^{T}(x_t - z_t\Pi_{xz}')\Omega_{xx}^{-1}(x_t - z_t\Pi_{xz}')'\right\}$

under the restrictions (4.1) and (3.19). Replacing $v_t$ by $s_t$ and adding (4.1)
and (3.19) multiplied by Lagrange multipliers $\lambda$ (a vector of $D$ coordinates) and
$\phi$ respectively to the logarithm of $L$ we obtain after division by $T$

(5.7)    $-\frac{H}{2}\log 2\pi + \frac{1}{2}\log|\Omega_{xx}^{-1}| + \beta\Pi_{xs}\lambda' + \phi(\beta\Phi_{xx}\beta' - 1) - \frac{1}{2T}\sum_{t=1}^{T}(x_t - u_t\Pi^{*\prime}_{xu} - s_t\Pi_{xs}')\Omega_{xx}^{-1}(x_t - u_t\Pi^{*\prime}_{xu} - s_t\Pi_{xs}')'$.

Differentiating (5.7) with respect to $\beta$ we obtain

(5.8)    $\Pi_{xs}\lambda' + 2\phi\Phi_{xx}\beta'$.

Setting this equal to zero and multiplying by $\beta$, we have

$\beta\Pi_{xs}\lambda' + 2\phi\beta\Phi_{xx}\beta' = 0$.

By virtue of (4.1) and (3.19), the Lagrange multiplier $\phi$ must be zero. Hence,
as far as the derivatives of (5.7) are concerned the restriction (3.19) does not
enter. The setting of the derivatives of (5.7) equal to zero and (4.1) will define
$\hat\beta$ except for a constant of proportionality which is finally determined by (3.19).
For convenience in deriving the estimates we shall use the normalization

(5.9)    $\hat\beta\hat\Omega_{xx}\hat\beta' = 1$.

The derivatives of (5.7) with respect to the coordinates of $\Omega_{xx}^{-1}$, $\Pi_{xs}$, $\Pi^*_{xu}$, and
$\beta$ are set equal to zero, resulting in

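(5.10)    $\hat\Omega_{xx} = M_{xx} - M_{xu}\hat\Pi^{*\prime}_{xu} - \hat\Pi^*_{xu}M_{ux} - M_{xs}\hat\Pi_{xs}' - \hat\Pi_{xs}M_{sx} + \hat\Pi^*_{xu}M_{uu}\hat\Pi^{*\prime}_{xu} + \hat\Pi_{xs}M_{ss}\hat\Pi_{xs}'$,

(5.11)    $\hat\Omega_{xx}^{-1}(M_{xs} - \hat\Pi_{xs}M_{ss}) + \hat\beta'\lambda = 0$,

(5.12)    $\hat\Omega_{xx}^{-1}(M_{xu} - \hat\Pi^*_{xu}M_{uu}) = 0$,

(5.13)    $\hat\Pi_{xs}\lambda' = 0$.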

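Solving (5.12) for $\hat\Pi^*_{xu}$, we obtain

(5.14)    $\hat\Pi^*_{xu} = P^*_{xu}$

defined by (4.12). Solving (5.11) for $\hat\Pi_{xs}$, we obtain

(5.15)    $\hat\Pi_{xs} = P_{xs} + \hat\Omega_{xx}\hat\beta'\lambda M_{ss}^{-1}$.

Multiplying (5.15) by $\hat\beta$ and solving for $\lambda$, we obtain

(5.16)    $\lambda = -\hat\beta P_{xs}M_{ss}$.

Substitution into (5.15) gives

(5.17)    $\hat\Pi_{xs} = (I - \hat\Omega_{xx}\hat\beta'\hat\beta)P_{xs}$.

In view of (5.14) and (5.17) we can write (5.10) as

(5.18)    $\hat\Omega_{xx} = W_{xx} + \hat\Omega_{xx}\hat\beta'\hat\beta P_{xs}M_{ss}P_{xs}'\hat\beta'\hat\beta\hat\Omega_{xx}$.

Let

(5.19)    $\hat\beta P_{xs}M_{ss}P_{xs}'\hat\beta' = \mu$.

Then multiplication of (5.18) on the right by $\hat\beta'$ with use of (5.9) gives

$\hat\Omega_{xx}\hat\beta' = W_{xx}\hat\beta' + \hat\Omega_{xx}\hat\beta'\hat\beta P_{xs}M_{ss}P_{xs}'\hat\beta' = W_{xx}\hat\beta' + \mu\hat\Omega_{xx}\hat\beta'$;

that is,

(5.20)    $\hat\Omega_{xx}\hat\beta' = \frac{1}{1 - \mu}W_{xx}\hat\beta'$.

Equation (5.13) can be written as

(5.21)    $P_{xs}M_{ss}P_{xs}'\hat\beta' - \mu\hat\Omega_{xx}\hat\beta' = 0$

by substitution from (5.16), (5.17) and (5.19). Combining (5.20) and (5.21) we
obtain

(5.22)    $(P_{xs}M_{ss}P_{xs}' - \nu W_{xx})\hat\beta' = 0$,

where

$\nu = \mu/(1 - \mu)$.

For (5.22) to have a solution, $\nu$ must be a root of

(5.23)    $|P_{xs}M_{ss}P_{xs}' - \nu W_{xx}| = 0$.

Substituting from (5.20) into (5.18) we obtain

(5.24)    $\hat\Omega_{xx} = W_{xx} + \mu\left(\frac{1}{1 - \mu}\right)^2 W_{xx}\hat\beta'\hat\beta W_{xx} = W_{xx} + \nu(1 + \nu)W_{xx}\hat\beta'\hat\beta W_{xx}$.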



To determine which root of (4.14) to use we shall compute the value of the
likelihood function when these estimates are used. It will be convenient to use
the solution $b$ of (4.13) with normalization (5.5). Thus $b$ is proportional to
$\hat\beta$. In fact since

$\hat\beta W_{xx}\hat\beta' = 1 - \mu$

from (5.20), we see that

$\hat\beta = \sqrt{1 - \mu}\; b = b/\sqrt{1 + \nu}$.

Let the other solutions of (4.13) be $b_2, \ldots, b_H$, with corresponding roots
$\nu_2, \ldots, \nu_H$, and

$B^* = \begin{pmatrix} b \\ b_2 \\ \vdots \\ b_H \end{pmatrix}$.

Since

(5.25)    $|\hat\Omega_{xx}| = |W_{xx} + \nu W_{xx}b'bW_{xx}|$,

we have

(5.26)    $|B^*|\,|\hat\Omega_{xx}|\,|B^{*\prime}| = |I + \nu B^*W_{xx}b'bW_{xx}B^{*\prime}|$.

Since

$bW_{xx}B^{*\prime} = (1, 0, \ldots, 0)$,

and since

$B^*W_{xx}B^{*\prime} = I$,

we deduce from (5.26)

$|\hat\Omega_{xx}| = |W_{xx}|(1 + \nu)$.

Multiplying (5.10) by $\hat\Omega_{xx}^{-1}$, taking the trace, and substituting in (5.6) we obtain

(5.27)    $L = (2\pi e)^{-HT/2}\,|W_{xx}|^{-T/2}(1 + \nu)^{-T/2}$.

This is a maximum if $\nu$ is the smallest root of (4.14).

The theorem now results. The expression for $\hat\sigma^2$ follows from (5.20).
If $\Phi_{xx}$ is a known constant matrix, $\hat\Phi_{xx} = \Phi_{xx}$; if $\Phi_{xx}$ is a function of the
parameters, $\hat\Phi_{xx}$ is the same function of the estimates.





If we define 
(5 28) 




♦ 

(M I 


^ p= "" uu)' 


we have by (4 9) 

(6 29) 

Since $\hat\beta$ annihilates $\hat\Pi_{xv}$, (5.3) results.

The estimate of $\Pi_{xs}$ is given by (5.17) and the estimate of $\Omega_{xx}$ is

(5.30)    $\hat\Omega_{xx} = W_{xx} + \nu W_{xx}b'bW_{xx}$.


6. The likelihood ratio test of restrictions. It has been assumed that the
selected structural equation is identified by imposing the restrictions that certain
coefficients are zero. It was noted in Section 3 that at least $G - 1$ such restrictions
are necessary. If $D$, the number of restrictions on the predetermined
variables, is more than $H - 1$, we can test the hypothesis that these $D$ coefficients
are zero against the alternative that only a smaller number are zero. This is
equivalent to a test that $\Pi_{xv}$ is of rank $H - 1$ against the alternative that the
rank is $H$.

It can be seen intuitively that the smallest root $\nu$ of (4.14) indicates how near
$P_{xs}$ is to being singular. This statistic can be used to test the hypothesis that
$\Pi_{xv}$ is of rank $H - 1$. The test is similar to the test of rank given by P. L.
Hsu [8]. The test is stated precisely in the following theorem.

THEOREM 2. Under assumptions A, B, C, and D the likelihood ratio criterion
for testing the hypothesis that $\Pi_{xv}$ is of rank $H - 1$ against the alternative that it is
of rank $H$ is

(6.1)    $(1 + \nu)^{-T/2}$,

where $\nu$ is the smallest root of (4.14).

PROOF. If there is no restriction on $\Pi_{xs}$, the maximum likelihood estimate of
$\Pi_{xs}$ is $P_{xs}$, of $\Pi^*_{xu}$ is $P^*_{xu}$, and of $\Omega_{xx}$ is $W_{xx}$. Then the likelihood function is

(6.2)    $(2\pi e)^{-HT/2}\,|W_{xx}|^{-T/2}$.

The ratio between this and the likelihood function (5.27) maximized under the
hypothesis that the rank of $\Pi_{xs}$ is $H - 1$ is (6.1).

It is proved in the paper following the present one that under certain conditions
(more general than those of Theorem 2)

(6.3)    $-2\log[(1 + \nu)^{-T/2}] = T\log(1 + \nu)$

is distributed asymptotically as $\chi^2$ with $D - H + 1$ degrees of freedom. Thus
an approximate test of significance is given by comparing (6.3) with a significance
point of the $\chi^2$-distribution with degrees of freedom equal to the excess number of
coefficients required to be zero (i.e., the number beyond the minimum required
for identification).
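The criterion (6.3) is a one-line computation once the smallest root is known. A minimal sketch in plain Python follows; the function name and the numerical inputs are hypothetical, and the critical value must be taken from a $\chi^2$ table (the figure 7.815 used below is the tabled upper 5 per cent point for 3 degrees of freedom).

```python
import math

def overidentification_statistic(nu, T, D, H):
    """Return T log(1 + nu) and its asymptotic degrees of freedom D - H + 1."""
    statistic = T * math.log1p(nu)
    degrees_of_freedom = D - H + 1
    return statistic, degrees_of_freedom

# Hypothetical values: smallest root 0.02 from T = 100 observations,
# with D = 4 excluded predetermined variables and H = 2 dependent variables.
statistic, df = overidentification_statistic(0.02, 100, 4, 2)
reject = statistic > 7.815   # tabled 5 per cent point of chi-square with 3 d.f.
```

Here the statistic is about 1.98, well below the tabled point, so the restrictions would not be rejected in this illustrative case.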



7. Computational procedure. The estimation procedure given in sections 4 and 5
does not indicate the most efficient method for computing these estimates. The
procedure given here is believed to be efficient for ordinary computational
equipment and can easily be adapted for sequence-controlled computing machines.
Let us see what expressions occur in the estimation procedure for $\beta$ and $\gamma$.
We find that we must know $W_{xx}$, $P_{xs}M_{ss}P_{xs}'$, and $P^*_{xu}$; these will suffice
if $\Phi_{xx}$ is constant or $W_{xx}$ to estimate $\beta$, $\gamma$, and $\sigma^2$. In what follows, we shall
assume the normalization is $\beta^1 = 1$, as the results for other normalizations
follow immediately. Examining the estimation equations, we see that we may
use any matrices proportional to the moment matrices. If equation (3.11)
has a constant term, it is better to use moments about the mean and estimate
the constant term by setting the calculated mean of the disturbances equal to
zero. One possible method of correcting for the mean is to calculate

(7 1) mi, = T T. V‘fit ^ (S I?') 

iwi \i-<i /\(*-i / 

The estimation procedure for $\beta$, and the remainder of $\gamma$ is not affected by
correcting for the mean. The computational procedure indicated here is
unchanged except for a factor of proportionality in the equation for $\hat\sigma^2$ if a
different form of correction for the mean is used.

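7.1. Computation of $M_{xz}M_{zz}^{-1}M_{zx}$ and $W_{xx}$. It is known that

(7.2)    $W_{xx} = M_{xx} - M_{xz}M_{zz}^{-1}M_{zx}$.

We shall use (7.2) to compute $W_{xx}$. We shall compute $M_{xz}M_{zz}^{-1}M_{zx}$ by the
method given by Dwyer [4]. Let us denote the element in the $i$th row and
$j$th column of $M_{zz}$ by $a_{ij}$, and the element in the $i$th row and $j$th column of
$M_{zx}$ by $b_{ij}$. Let us construct the following array

$\begin{array}{cccc|cccc}
c_{11} & c_{12} & \cdots & c_{1K} & e_{11} & e_{12} & \cdots & e_{1H} \\
d_{11} & d_{12} & \cdots & d_{1K} & f_{11} & f_{12} & \cdots & f_{1H} \\
 & c_{22} & \cdots & c_{2K} & e_{21} & e_{22} & \cdots & e_{2H} \\
 & d_{22} & \cdots & d_{2K} & f_{21} & f_{22} & \cdots & f_{2H} \\
 & & \ddots & \vdots & \vdots & & & \vdots \\
 & & & c_{KK} & e_{K1} & e_{K2} & \cdots & e_{KH} \\
 & & & d_{KK} & f_{K1} & f_{K2} & \cdots & f_{KH}
\end{array}$

where

$c_{ij} = a_{ij} - \sum_{k<i} d_{ki}c_{kj}$,    $1 \le i \le j \le K$,

$d_{ij} = c_{ij}/c_{ii}$,    $1 \le i \le j \le K$,

$e_{ij} = b_{ij} - \sum_{k<i} d_{ki}e_{kj}$,    $1 \le i \le K$, $1 \le j \le H$,

$f_{ij} = e_{ij}/c_{ii}$,    $1 \le i \le K$, $1 \le j \le H$.

Then the element in the $i$th row and $j$th column of the symmetric matrix
$M_{xz}M_{zz}^{-1}M_{zx}$ is

$\sum_{k=1}^{K} e_{ki}f_{kj}$.

If we wish to estimate several equations in the system by this method, this
step need only be done once, as $M_{xz}M_{zz}^{-1}M_{zx}$ and $W_{xx}$ do not depend upon the
equation (except that $x$ would be enlarged).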

7.2. Computation of $P^*_{xu}$. We shall compute $P^*_{xu}$ by the abbreviated Doolittle
method. Let us now denote the element in the $i$th row and $j$th column of $M_{uu}$
by $a_{ij}$, of $M_{ux}$ by $b_{ij}$. Then let us perform the previous operations, not
including the last step. We may arrange the work, if only one equation is to
be estimated, so that this is already done. Then define

$g_{ij} = f_{ij} - \sum_{i<k\le F} d_{ik}g_{kj}$,    $1 \le i \le F$, $1 \le j \le H$.

Then the element in the $i$th row and $j$th column of $P^{*\prime}_{xu}$ is $g_{ij}$.
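The forward and backward passes of sections 7.1 and 7.2 can be sketched in plain Python. This is an illustration under stated assumptions: the function names are hypothetical, matrices are lists of lists, the recursions are written for a symmetric positive definite matrix $A$, and zero-based indices replace the one-based indices of the text.

```python
def forward_pass(A, B):
    """Abbreviated Doolittle forward pass for symmetric A (K x K) and B (K x H).
    Returns the arrays d, e, f and the quadratic form B' A^{-1} B."""
    K, H = len(A), len(B[0])
    c = [[0.0] * K for _ in range(K)]
    d = [[0.0] * K for _ in range(K)]
    e = [[0.0] * H for _ in range(K)]
    f = [[0.0] * H for _ in range(K)]
    for i in range(K):
        for j in range(i, K):
            c[i][j] = A[i][j] - sum(d[k][i] * c[k][j] for k in range(i))
        for j in range(H):
            e[i][j] = B[i][j] - sum(d[k][i] * e[k][j] for k in range(i))
        for j in range(i, K):
            d[i][j] = c[i][j] / c[i][i]      # row i divided by its leading element
        for j in range(H):
            f[i][j] = e[i][j] / c[i][i]
    quad = [[sum(e[k][i] * f[k][j] for k in range(K)) for j in range(H)]
            for i in range(H)]
    return d, e, f, quad

def back_solution(d, f):
    """Back substitution giving A^{-1} B from the arrays d and f."""
    K, H = len(f), len(f[0])
    g = [[0.0] * H for _ in range(K)]
    for i in range(K - 1, -1, -1):
        for j in range(H):
            g[i][j] = f[i][j] - sum(d[i][k] * g[k][j] for k in range(i + 1, K))
    return g

# Small illustrative case (numbers are arbitrary).
A = [[4.0, 2.0], [2.0, 3.0]]
B = [[2.0], [1.0]]
d, e, f, quad = forward_pass(A, B)
g = back_solution(d, f)
```

With $A = M_{zz}$ and $B = M_{zx}$ the array `quad` is $M_{xz}M_{zz}^{-1}M_{zx}$; with $A = M_{uu}$ and $B = M_{ux}$ the array `g` is $M_{uu}^{-1}M_{ux}$, the transpose of $P^*_{xu}$.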

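7.3. Computation of $P_{xs}M_{ss}P_{xs}'$. We know that

(7.3)    $P_{xs}M_{ss}P_{xs}' = M_{xz}M_{zz}^{-1}M_{zx} - M_{xu}M_{uu}^{-1}M_{ux}$.

Let us compute $P_{xs}M_{ss}P_{xs}'$, using (7.3). We must first calculate $M_{xu}M_{uu}^{-1}M_{ux}$.
We may do this either by the method of section 7.1, or as $P^*_{xu}M_{ux}$.
7.4. Computation of $\nu$, $b$, and $\hat\gamma$. We shall use

(5.3)    $\hat\gamma = -bP^*_{xu}$

to compute $\hat\gamma$ after $b$ has been computed.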


Case 1) Lf = 1, In this case the vector ^ = (i), b=i , 

Case 2) $H = 2$, $D > 1$. Let $a_{ij}$ denote the element in the $i$th row and $j$th
column of $P_{xs}M_{ss}P_{xs}'$, $w_{ij}$ the element in the $i$th row and $j$th column of $W_{xx}$.
Define

$k_0 = |P_{xs}M_{ss}P_{xs}'|$,

$k_2 = |W_{xx}|$,

$k_1 = \frac{1}{2}(a_{11}w_{22} + a_{22}w_{11} - 2a_{12}w_{12})$.

Then

$\nu = \dfrac{k_1 - \sqrt{k_1^2 - k_0k_2}}{k_2}$.


Let 0 PxfMttP XI yWti ,, Then 




4 



On 

6n 
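For $H = 2$ the determinantal equation (4.14) is a quadratic in $\nu$, so the smallest root has a closed form. The following is a minimal sketch in plain Python. The names `k0`, `k1`, `k2` and the function name are illustrative, and the expression used for $b$ (first coordinate set to 1, second coordinate taken from the first row of $P_{xs}M_{ss}P_{xs}' - \nu W_{xx}$) is one way of solving (4.13) that is assumed here; it requires the off-diagonal element of that matrix to be non-zero.

```python
import math

def smallest_root_h2(a, w):
    """a: 2 x 2 matrix P_xs M_ss P_xs'; w: 2 x 2 matrix W_xx (both symmetric)."""
    k0 = a[0][0] * a[1][1] - a[0][1] ** 2
    k2 = w[0][0] * w[1][1] - w[0][1] ** 2
    k1 = 0.5 * (a[0][0] * w[1][1] + a[1][1] * w[0][0] - 2.0 * a[0][1] * w[0][1])
    # |a - nu w| = k0 - 2 k1 nu + k2 nu^2; take the smaller root.
    disc = max(k1 * k1 - k0 * k2, 0.0)
    nu = (k1 - math.sqrt(disc)) / k2
    q11 = a[0][0] - nu * w[0][0]
    q12 = a[0][1] - nu * w[0][1]
    b = (1.0, -q11 / q12)            # assumes q12 != 0
    return nu, b

# Arbitrary illustrative matrices.
a = [[3.0, 1.0], [1.0, 2.0]]
w = [[1.0, 0.0], [0.0, 1.0]]
nu, b = smallest_root_h2(a, w)
```

Because $W_{xx}$ is the identity in this example, $\nu$ is simply the smaller characteristic root of `a`.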






CWri>// III lliw raw M = 0 Then 6 = , and 4 

mil} \)p ashiffiro 

Case 4) $H > 2$, $D > H - 1$. By the procedure of section 7.2, compute
$A = (P_{xs}M_{ss}P_{xs}')^{-1}W_{xx}$. Let us multiply equation (5.22) by $-\frac{1}{\nu}(P_{xs}M_{ss}P_{xs}')^{-1}$,
and set $1/\nu = \lambda$. We obtain

(7.4)    $(A - \lambda I)\beta' = 0$,

where $\lambda$ is the largest characteristic root of $A$. Then we may employ the
method of Aitken [1] to estimate $\lambda$ and $\beta$. Let $b_0$ be an approximation to $\beta$;
the column of $A$ with largest absolute values is generally a satisfactory
approximation. Define

$b_i' = Ab_{i-1}'$,    $\lambda_{ij} = b_i^j/b_{i-1}^j$,

where $b_i^j$ is the $j$th coordinate of $b_i$. The quantities $\lambda_{ij}$ approach $\lambda$ as $i$ increases, and the normalized vectors $b_i$
approach $\beta$. The convergence may be accelerated by the methods given by
Aitken. The normalization should not be carried out until the $\lambda_{ij}$ are sufficiently
close for different $j$.
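The iteration described for this case is the familiar power method. Below is a minimal sketch in plain Python, without Aitken's acceleration. The function name and the example matrix are hypothetical. One departure from hand computation is that the vector is rescaled at every step to keep the numbers bounded; the coordinate ratios are taken before rescaling, so they are unaffected. The sketch assumes the first coordinate of the limiting vector is non-zero.

```python
def largest_root_by_iteration(A, b0, max_iter=500, tol=1e-12):
    """Repeated multiplication b_i' = A b_{i-1}'; the coordinate ratios
    approach the largest characteristic root of A."""
    n = len(A)
    b = list(b0)
    ratios = []
    for _ in range(max_iter):
        new_b = [sum(A[i][j] * b[j] for j in range(n)) for i in range(n)]
        ratios = [new_b[j] / b[j] for j in range(n) if b[j] != 0.0]
        scale = max(abs(v) for v in new_b)
        b = [v / scale for v in new_b]       # rescaling for numerical safety only
        if max(ratios) - min(ratios) < tol:  # ratios agree for different j
            break
    lam = sum(ratios) / len(ratios)
    vector = [v / b[0] for v in b]           # first coordinate set to 1
    return lam, vector

# Arbitrary illustrative matrix; start from the column with largest absolute values.
A = [[2.0, 1.0], [1.0, 3.0]]
lam, vec = largest_root_by_iteration(A, [1.0, 3.0])
nu = 1.0 / lam                               # since lambda = 1 / nu
```

In the setting of the text, `A` would be $(P_{xs}M_{ss}P_{xs}')^{-1}W_{xx}$, so the largest $\lambda$ corresponds to the smallest root $\nu$.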

Case 5) $H > 2$, $D = H - 1$. Let us go through the procedure of section 7.2
with $A = P_{xs}M_{ss}P_{xs}'$ and with no matrix $B$. Then $c_{HH} = 0$. Set $g_H = 1$,
and compute

$g_i = -\sum_{i<k\le H} d_{ik}g_k$.

Then 


0 ' 



V 


0 , 


7.5. Computation of $\hat\sigma^2$. We have

(7.5)    $\hat\sigma^2 = (1 + \nu)bW_{xx}b'$.

If we use the $m$'s instead of the $M$'s, we must divide by $T^2$, and if other factors
of proportionality are used, we must divide by them. $\hat\sigma^2$ is in general biased,
but the bias depends upon the nature of the complete system, and is not easy to
calculate. The bias is of the order of $1/T$.


8. Confidence regions based on small sample theory.* If all of the
predetermined variables in the system are exogenous (i.e., "fixed"), we can obtain
confidence regions for the coefficients of one equation on the basis of small sample
theory. To do this we require only that the disturbance of the selected equation
be normally distributed; that is, the linear form in the observations $\beta x_t' + \gamma u_t'$

* We are indebted to Professor A. Wald for assistance in simplifying our approach to this
problem.





is normally distributed with mean zero and variance $\sigma^2$. The regression of this
on fixed variates is normally distributed and certain quadratic forms in these
linear forms have $\chi^2$-distributions. On the basis of this we can set up confidence
regions for the coefficients.

In addition to assumptions A and B we use the following:
ASSUMPTION E. All of the coordinates of $z_t = (u_t, v_t)$ are exogenous. The
moment matrix $M_{zz}$ is non-singular. The disturbances of the selected equation are
distributed independently and normally with mean 0 and variance $\sigma^2$.

Suppose we have a set of observations $(x_1, u_1, v_1), \ldots, (x_T, u_T, v_T)$. If
we know $\beta$ and $\gamma$ we can obtain $T$ values of

(8.1)    $w_t = \beta x_t' + \gamma u_t'$,    $t = 1, \ldots, T$.

The sample regression coefficients of $w_t$ on $u_t$ and $s_t$ are


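(8.2)    $c = \frac{1}{T}\sum_{t=1}^{T} w_tu_tM_{uu}^{-1} = \beta M_{xu}M_{uu}^{-1} + \gamma$,

(8.3)    $e = \frac{1}{T}\sum_{t=1}^{T} w_ts_tM_{ss}^{-1} = \beta P_{xs}$.

The two vectors $c$ and $e$ are distributed independently and normally with mean 0
and covariance matrices

(8.4)    $\frac{\sigma^2}{T}M_{uu}^{-1}$,

(8.5)    $\frac{\sigma^2}{T}M_{ss}^{-1}$.

Hence (by usual regression theory)

(8.6)    $\frac{T}{\sigma^2}cM_{uu}c' = \frac{T}{\sigma^2}(\beta M_{xu}M_{uu}^{-1}M_{ux}\beta' + \beta M_{xu}\gamma' + \gamma M_{ux}\beta' + \gamma M_{uu}\gamma')$,

(8.7)    $\frac{T}{\sigma^2}eM_{ss}e' = \frac{T}{\sigma^2}\beta P_{xs}M_{ss}P_{xs}'\beta' = \frac{T}{\sigma^2}\beta(M_{xv} - M_{xu}M_{uu}^{-1}M_{uv})(M_{vv} - M_{vu}M_{uu}^{-1}M_{uv})^{-1}(M_{vx} - M_{vu}M_{uu}^{-1}M_{ux})\beta'$,

(8.8)    $\frac{T}{\sigma^2}\beta W_{xx}\beta'$,

are distributed independently as $\chi^2$ with $F$, $D$ and $T - K$ degrees of freedom,
respectively. From these considerations we can obtain the desired confidence
regions.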

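THEOREM 3. Under assumptions A, B and E, if the normalization is

(8.9)    $\beta\Phi_{xx}\beta' = 1$,

where $\Phi_{xx}$ is a known matrix, (a) a confidence region for $\beta$ of confidence $\epsilon$ consists of
all $\beta^*$ satisfying (8.9) and

(8.10)    $\dfrac{\beta^*P_{xs}M_{ss}P_{xs}'\beta^{*\prime}}{\beta^*W_{xx}\beta^{*\prime}} \le \dfrac{D}{T - K}F_{D,T-K}(\epsilon)$,

where $F_{D,T-K}(\epsilon)$ is chosen so the probability of (8.10) for $\beta^* = \beta$ is $\epsilon$. (b) A
confidence region for $\beta$ and $\gamma$ simultaneously consists of all $\beta^*$ and $\gamma^*$ satisfying
(8.9) and

(8.11)    $\dfrac{\beta^*M_{xu}M_{uu}^{-1}M_{ux}\beta^{*\prime} + \beta^*M_{xu}\gamma^{*\prime} + \gamma^*M_{ux}\beta^{*\prime} + \gamma^*M_{uu}\gamma^{*\prime} + \beta^*P_{xs}M_{ss}P_{xs}'\beta^{*\prime}}{\beta^*W_{xx}\beta^{*\prime}} \le \dfrac{K}{T - K}F_{K,T-K}(\epsilon)$.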

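(c) If the normalization is $\sigma^2 = 1$, then a confidence region for $\beta$ of confidence
$\epsilon_1\epsilon_2$ consists of all $\beta^*$ satisfying

(8.12)    $T\beta^*P_{xs}M_{ss}P_{xs}'\beta^{*\prime} \le \chi^2_D(\epsilon_1)$,

(8.13)    $\underline{\chi}^2_{T-K}(\epsilon_2) \le T\beta^*W_{xx}\beta^{*\prime} \le \bar{\chi}^2_{T-K}(\epsilon_2)$,

where $\chi^2_D(\epsilon_1)$ is chosen so that the probability of (8.12) is $\epsilon_1$ when $\beta^* = \beta$ and $\underline{\chi}^2_{T-K}(\epsilon_2)$
and $\bar{\chi}^2_{T-K}(\epsilon_2)$ are chosen so that the probability of (8.13) is $\epsilon_2$ when $\beta^* = \beta$ and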


(H l-l) 


x'(fi) < I < 


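(d) A confidence region for $\beta$ and $\gamma$ simultaneously consists of all $\beta^*$ and $\gamma^*$ satisfying
(8.13) and

(8.15)    $T(\beta^*M_{xu}M_{uu}^{-1}M_{ux}\beta^{*\prime} + \beta^*M_{xu}\gamma^{*\prime} + \gamma^*M_{ux}\beta^{*\prime} + \gamma^*M_{uu}\gamma^{*\prime} + \beta^*P_{xs}M_{ss}P_{xs}'\beta^{*\prime}) \le \chi^2_K(\epsilon_1)$.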

Region (c) is the interior of an ellipsoid and an ellipsoidal shell in the $\beta^*$-space;
region (d) is similar in the $\beta^*, \gamma^*$-space. Region (a) consists of the intersection
of the quadric surface (8.9) and the interior of a cone in the $\beta^*$-space; region (b)
is similar in the $\beta^*, \gamma^*$-space.

It is clear that there are many other ways of constructing confidence regions
by taking regression on other fixed variates. Of these the best seem to be those
of Theorem 3. It has been proved [2] that the regions of Theorem 3 are consistent
in the sense that for sufficiently large $T$ the probability is arbitrarily near 1 that
all of the confidence region is within a certain distance of $\beta$ or $\beta, \gamma$. For an
application of this technique to economic data see a paper by Bartlett [3] who
suggested this method independently.
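Whether a trial vector $\beta^*$ lies in a region of type (a) can be checked with two least-squares regressions. The sketch below, in Python with NumPy, assumes the region takes the variance-ratio form in which $\beta^*P_{xs}M_{ss}P_{xs}'\beta^{*\prime}/\beta^*W_{xx}\beta^{*\prime}$ is compared with $D/(T-K)$ times a tabled $F$ point. The function names, the simulated data, and the value `3.0` standing in for the tabled point are all hypothetical.

```python
import numpy as np

def variance_ratio(beta_star, X, U, V):
    """Ratio of the sum of squares of w = X beta* explained by V beyond U
    to the residual sum of squares from regressing w on (U, V)."""
    w = X @ beta_star
    Z = np.hstack([U, V])
    fit_u = U @ np.linalg.lstsq(U, w, rcond=None)[0]
    fit_z = Z @ np.linalg.lstsq(Z, w, rcond=None)[0]
    explained = np.sum((fit_z - fit_u) ** 2)
    residual = np.sum((w - fit_z) ** 2)
    return explained / residual

def in_confidence_region(beta_star, X, U, V, f_point):
    """f_point: significance point of F with D and T - K degrees of freedom,
    to be supplied from a table."""
    T, D = X.shape[0], V.shape[1]
    K = U.shape[1] + D
    return variance_ratio(beta_star, X, U, V) <= D / (T - K) * f_point

# Hypothetical data with true beta = (1, -0.5).
rng = np.random.default_rng(1)
T = 200
U = np.ones((T, 1))
V = rng.normal(size=(T, 2))
e2 = rng.normal(size=T)
x2 = 1.0 + V[:, 0] + V[:, 1] + e2
x1 = 0.5 * x2 + 1.0 + 0.5 * e2 + rng.normal(size=T)
X = np.column_stack([x1, x2])
ratio_at_truth = variance_ratio(np.array([1.0, -0.5]), X, U, V)
ratio_far_away = variance_ratio(np.array([1.0, 2.0]), X, U, V)
inside = in_confidence_region(np.array([1.0, -0.5]), X, U, V, f_point=3.0)  # placeholder point
```

Scanning a grid of trial vectors with `in_confidence_region` traces out the region; the ratio is unchanged when $\beta^*$ is multiplied by a constant, which is why a normalization such as (8.9) is needed in addition.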


9. An approximate small sample test of restrictions. When $\beta^* = \beta$, the
probability of (8.10) is $\epsilon$. If $\beta^*$ is replaced by $b$ which minimizes the expression
on the left, the probability is at least as great; it is, say, $1 - \delta$.
the smallest root of 


(9.1) 


^ jlfjl il/11 ^fX 


T 

^ T - K 



0 , 


Tins ratio ii X, 


Since 
(9 2) 


A 


T-K 

TD 


where $\nu$ is the smallest root of (4.14), the probability of

(9.3)    $\nu > \dfrac{D}{T - K}F_{D,T-K}(\epsilon)$

is $\delta \le (1 - \epsilon)$. We summarize this as follows.

THEOREM 4. Under assumptions A, B, E, the inequality (9.3), where $\nu$ is
the smallest root of (4.14), constitutes a test of the hypothesis that the coefficients of
$v_t$ in the selected structural equation are zero of significance level less than $1 - \epsilon$.

This test 19 simply an approximation to the test given in ^c^Tn)n (} This 
exact probability, 6, of (9 3) isnnknoxvn; in fact llie iliHtnbutum of v ilepomln on 
IT^o and the distribution of 5( Ilowcvei, since 5 lies between 0 mill 1 it 
know that if the test jg used as though llio level were 1 — it Iho will im 
"conservative 

Another approximate test of the restrictions can be obtained from the
inequality (8.11). If the hypothesis is rejected on the basis of one of these tests,
the corresponding confidence region (for $\beta$ or for $\beta$ and $\gamma$) is imaginary, for all
$\beta$ or $\beta$ and $\gamma$ are excluded. It should be noticed that the use of a variance ratio
to test the hypothesis at significance level $\delta$ ($\le 1 - \epsilon$) does not affect the
confidence coefficient $\epsilon$ of the confidence region when the hypothesis is true.


REFERENCES


[1] A. C. Aitken, "Studies in practical mathematics. II. The evaluation of the latent
roots and latent vectors of a matrix," Edinb. Roy. Soc. Proc., Vol. 57 (1937),
pp. 269-305.
[2] T. W. Anderson and Herman Rubin, "The asymptotic properties of estimates of the
parameters of a single equation in a complete system of stochastic equations,"
to be published.
[3] M. S. Bartlett, "A note on the statistical estimation of demand and supply relations
from time series," Econometrica, Vol. 16 (1948), pp. 323-329.
[4] P. S. Dwyer, "The evaluation of linear forms," Psychometrika, Vol. 6 (1941).
[5] R. A. Fisher, "The statistical utilization of multiple measurements," Annals of
Eugenics, Vol. 8 (1938), pp. 376-386.
[6] M. A. Girshick and T. Haavelmo, "Statistical analysis of the demand for food:
examples of simultaneous estimation of structural equations," Econometrica,
Vol. 15 (1947).
[8] P. L. Hsu, "On the problem of rank and the limiting distribution of Fisher's test
function," Annals of Eugenics, Vol. 11 (1941), pp. 39-41.
[9] H. B. Mann and A. Wald, "On the statistical treatment of linear stochastic difference
equations," Econometrica, Vol. 11 (1943), pp. 173-220.
[10] Olav Reiersøl, "Confluence analysis by means of lag moments and other methods of
confluence analysis," Econometrica, Vol. 9 (1941), pp. 1-24.
[11] Statistical Inference in Dynamic Economic Models, to be published as Cowles
Commission Monograph No. 10.



SOME SIGNIFICANCE TESTS FOR THE MEDIAN WHICH ARE
VALID UNDER VERY GENERAL CONDITIONS¹


By John E. Walsh

The Rand Corporation


1. Summary. Order statistics are used to derive significance tests for the
population median which are valid under very general conditions. These tests
are approximately as powerful as the Student t-test for small samples from a
normal population. Also the application of a test requires very little computation.
Thus the tests derived compare very favorably with the t-test for small
sets of observations. Applications of these order statistic tests to certain well
known statistical problems are given in another paper [1].


PART I. RESULTS AND DEFINITIONS


2. Introduction. Consider $n$ independent observations drawn from $n$ populations
satisfying the conditions (A):

1) Each population is continuous (i.e. its cdf is continuous).

2) Each population is symmetrical.

3) The median of each population has the same value $\phi$. (If the 50% point
of a continuous symmetrical population is not unique, the median $\phi$ of the
population is defined to be the midpoint of the segment of 50% values.)

It is to be emphasized that no two of the observations are necessarily drawn
from the same population. Significance tests are derived to compare $\phi$ with a
given constant value $\phi_0$.

A general method of obtaining one-sided and symmetrical tests is given in section 8.
This general method furnishes tests which have significance levels of the
form $r/2^n$, $(r = 1, \ldots, 2^n - 1)$. Each value of $r$ can be attained for some
one-sided test. Unfortunately tests obtained by the general method are very difficult
to apply from a computational viewpoint. If $n > 10$, the number of computations
required for the application of a test is prohibitive.

To overcome the computational difficulty involved in using the general method,
easily applied tests using order statistics are derived. These tests are based on
order statistics of certain combinations of order statistics of the $n$ observations,
each combination being either a single order statistic of the $n$ observations or
one-half the sum of two order statistics. The tests are invariant under permutation
of the $n$ observations and have significance levels of the form $r/2^n$,
$(r = 1, \ldots, 2^n - 1)$. Table 1 contains a list of some one-sided and symmetrical
tests for $n \le 15$ ($x_1, \ldots, x_n$ represent the observations arranged in increasing
order of magnitude). Additional significance tests can be obtained by use of
Theorem 4 of section 6.


¹ The results presented in this paper were obtained in the course of research
the ut“«% ' 



If a symmetrical population has a mean, the mean has the same value as the
median. Thus if each population from which an observation is drawn satisfies
the additional condition that its mean exists, the median tests derived in this
paper are also tests for the mean.

Although it is unlikely that conditions (A) are ever exactly satisfied in practice,
these conditions appear to be approximately satisfied in many practical
situations. Moreover conditions (A) are of such a simple form that approximate
verification can frequently be obtained without an extensive investigation.

Certain of the order statistic tests are very efficient if the $n$ observations are a
sample from a normal population. Efficiencies are listed for some of the tests in
Table 1. These tests are approximately as efficient as the Student t-test. (The
efficiency of a test, more precisely the power efficiency, is defined in section 3.)

The order statistic tests are competitive with the Student t-test. In choosing
between the two types of tests the following considerations may be of interest:

(a) The order statistic tests are valid under much more general conditions than
the t-test.

(b) The order statistic tests are almost as efficient as the t-test for small samples
from a normal population.

(c) The order statistic tests are more easily computed than the t-test.

(d) For the case of a sample from a normal population and a given significance
level, the t-test gives more information than the order statistic tests.

In some cases a set of $n$ independent observations satisfying only 1) and 3) of
conditions (A) can be transformed into observations approximately satisfying all
of conditions (A) by an appropriate continuous monotonic change of variable.
For example, replacing each observation by the logarithm of the value of the
observation sometimes results in a set of observations having approximately
symmetrical distributions. Since the transformation, say $g(x)$, is continuous
and monotonic, the resulting observations will have median $g(\phi)$ if the original
observations have median $\phi$. Confidence intervals can be found for $\phi$ by first
obtaining confidence intervals for $g(\phi)$ on the basis of conditions (A) and then
inverting. Significance tests can be obtained from these confidence intervals.

The tests of Part I can be applied to furnish generalized solutions for several
well known statistical problems. Some of these applications are given in another
paper [1].
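The inversion step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the change of variable is taken to be the natural logarithm, the one-sided statistic is the example $\max[x_{n-1}, (x_{n-2} + x_n)/2]$ used later in section 5, and the function names are hypothetical rather than taken from the paper.

```python
import math


def lower_bound_statistic(values):
    """Return max[x_{n-1}, (x_{n-2} + x_n)/2] for the ordered sample.

    Under conditions (A) the event "statistic < median" is a one-sided
    confidence statement for the median of the supplied values.
    """
    x = sorted(values)
    if len(x) < 3:
        raise ValueError("need at least three observations")
    return max(x[-2], 0.5 * (x[-3] + x[-1]))


def lower_bound_for_median(observations):
    """One-sided bound for the median of positive observations.

    Assumption (for illustration): the logarithms of the observations
    approximately satisfy conditions (A). The bound is computed on the
    log scale and inverted with exp, which preserves the inequality
    because log is continuous and increasing.
    """
    if any(v <= 0 for v in observations):
        raise ValueError("the logarithm requires positive observations")
    logs = [math.log(v) for v in observations]
    return math.exp(lower_bound_statistic(logs))
```

Because the logarithm is increasing, the statement "statistic on the log scale < $g(\phi)$" and the statement "exp(statistic) < $\phi$" describe the same event, so the confidence coefficient is unchanged by the inversion.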

One application occurs in cases where there is reason to believe that conditions
(A) are satisfied but there is no reason to assume that the populations from
which the observations were drawn are even approximately the same. Perhaps
the most common situation of this type is that in which the value of a certain
quantity is experimentally determined by several different methods, all of which
should theoretically yield the same result. Then there is no reason to believe
that all the experimental values have the same precision. It may be permissible,
however, to assume that each value is an observation from a continuous symmetrical
population and that all the populations have the same median. Then
the order statistic tests can be used to test the true value of the quantity investigated.
For example, consider the determination of a specified physical constant.
Various scientists obtained experimental values for this constant by several different
methods. If it can be assumed that each value is an observation from a
continuous symmetrical population and that all the populations have the same
median, the true value of the physical constant can be tested by applying the
order statistic tests to the totality of the experimental values.

[TABLE 1. List of some one-sided and symmetrical significance tests for $n \le 15$, with efficiencies for some of the tests. The body of the table is not recoverable from the scan.]

3. Power efficiency of tests. A problem which arises throughout the paper
is that of determining how much information is lost by using some other test in
place of the most powerful test of a given hypothesis. The quantitative measure
of the amount of available information which is used by a test will be given as a
percentage and is called the power efficiency of the test considered.

In all cases investigated the underlying population is normal with unknown
variance and the hypotheses tested concern the population median (mean).
Then the most powerful test (one-sided or symmetrical) is the appropriate
Student t-test.

The procedure used to measure the power efficiency of a test is different from
the common method of measuring the efficiency of an estimate. The efficiency
of an estimate is obtained by taking the ratio of the variance of an efficient estimate
with respect to the variance of the given estimate (expressed as a percentage).
The method of determining the power efficiency of a test here
consists in continuously varying the sample size of the appropriate most powerful
test (same significance level) until the power functions of the given test and the
most powerful test are equivalent in the following sense: The area between the
two power curves for which the power function of the most powerful test exceeds
the power function of the given test is equal to the analogous area for which the
power function of the most powerful test is less than that of the given test. (It
is assumed that the power functions of the tests can be made to depend on the
values of a single parameter.) The sample size (not necessarily integral) of the
most powerful test with equivalent power function divided by the sample size of the
given test is called the power efficiency of the given test (expressed as a percentage).

In obtaining power efficiencies in the manner defined above, the sample size
of the most powerful test is allowed to assume non-integral values. This furnishes
an interpolated measure of the sample size of the most powerful test which is
power function equivalent to the given test. As pointed out above, the t-test
is a most powerful test for the situations considered in this paper. A method of
computing power function values for t-tests having non-integral sample sizes is
given below.

The definition of power efficiency selected is very convenient from a computational
point of view. Power function values for the t-test can be easily computed
through use of the normal approximation given in [2]. For the significance levels
considered in this paper, the normal approximation is reasonably accurate if
the sample size is not too small. In the remaining cases the approximation
underestimates some power function values and overestimates others. For the
situations investigated, however, the error introduced by this combination of
underestimation and overestimation tends to cancel out in the determination of
power efficiencies if the above area definition of equivalence of power functions is
used. Thus application of the normal approximation yields reasonably accurate
power efficiencies for the cases considered in this paper. Use of the
normal approximation furnishes an easily applied method of obtaining power
function values for t-tests having non-integral sample sizes.

Table 2 contains examples of the above described method of determining power
efficiencies. Here the power function values for the t-test were computed using
the normal approximation. Examination of Table 2 shows that the maximum
difference between corresponding power function values for the two types of
tests is small for all the cases considered there. This holds in the determination
of all the power efficiencies listed in Table 1.

[TABLE 2. Efficiencies and Power Function Values for Certain Order Statistic Tests. Columns: Test; Sample Size; Approx. Efficiency; Significance Level; Values of Power Function. Each order statistic test is paired with the t-test of equivalent (non-integral) sample size. The entries are not recoverable from the scan.]

Investigation indicates that the definition of power efficiency given here is for
all practical purposes the same as that given in [2].

For the situations considered in this paper, it is sufficient to restrict power
efficiency investigations to one-sided tests. Every symmetric test investigated
can be considered as a combination of two non-overlapping one-sided tests,
each having a significance level equal to half that of the symmetric test. Also,
from symmetry, these one-sided tests (each considered as a separate test) have
the same power efficiency. Thus it is an immediate consequence of the definition
of power efficiency that the symmetric test has the same efficiency as each of the
corresponding one-sided tests at half the significance level.
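The equal-area definition of power efficiency can be illustrated numerically. The sketch below is not the computation used in the paper: as a loudly stated simplification it compares the order statistic test $x_i < \phi_0$ with a one-sided normal test for known variance (a stand-in for the Student t-test and its normal approximation from [2]), and all function names are hypothetical. The two areas described in the definition are equal exactly when the signed area between the power curves is zero, so the code searches for the comparison sample size $m$ with zero signed area and reports $m/n$.

```python
from math import comb, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_order_stat(i, n, d):
    """Pr(x_i < phi_0) for a normal sample, with d = (phi_0 - phi)/sigma."""
    p = STD_NORMAL.cdf(d)
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(i, n + 1))


def power_z(m, alpha, d):
    """Power of the one-sided known-variance normal test (assumed stand-in)."""
    return STD_NORMAL.cdf(d * sqrt(m) - STD_NORMAL.inv_cdf(1.0 - alpha))


def signed_area(i, n, m, lo=-10.0, hi=10.0, steps=2000):
    """Trapezoidal integral of (comparison power - given power) over d."""
    alpha = power_order_stat(i, n, 0.0)  # both tests use the same level
    h = (hi - lo) / steps
    total = 0.0
    for s in range(steps + 1):
        d = lo + s * h
        weight = 0.5 if s in (0, steps) else 1.0
        total += weight * (power_z(m, alpha, d) - power_order_stat(i, n, d))
    return total * h


def power_efficiency(i, n, tol=1e-6):
    """Bisection for the (non-integral) sample size m with zero signed area.

    Assumes the significance level of the given test is below one half.
    """
    low, high = 0.5, 4.0 * n
    while high - low > tol:
        mid = 0.5 * (low + high)
        if signed_area(i, n, mid) < 0.0:
            low = mid   # comparison test still too weak: raise m
        else:
            high = mid
    return 0.5 * (low + high) / n
```

For example, `power_efficiency(5, 5)` gives the ratio for the test $x_5 < \phi_0$ with $n = 5$ under this stand-in comparison; the number it returns should not be read as an entry of Table 1 or Table 2, which are based on the t-test.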

PART II. DERIVATIONS

4. Introduction. The purpose of the remainder of the paper is to present
derivations of the significance test results stated in sections 1 and 2. The first
derivations consist in obtaining confidence intervals for $\phi$ on the basis of conditions
(A). Then properties of these confidence intervals are analyzed. Application
of the confidence intervals and their properties to significance tests furnishes
many of the results stated in sections 1 and 2. The remaining derivations are
concerned with efficiencies and the general method mentioned in section 2.


5. Derivation of confidence intervals. Let us consider $n$ independent observations,
each observation being drawn from a possibly different population.
Denote these observations by $y_1, \ldots, y_n$ and let the cdf of $y_i$ be given by $F_i$,
$(i = 1, \ldots, n)$. Furthermore let the $n$ populations from which these $n$ observations
were drawn satisfy conditions (A). Then 1) of conditions (A) requires
that each $F_i$ is continuous, while 2) and 3) stipulate that

$$\int_{-\infty}^{-c} dF_i(y_i - \phi) = \int_{c}^{\infty} dF_i(y_i - \phi), \qquad (i = 1, \ldots, n),$$

for all values of $c$ in the interval $-\infty < c < \infty$.

Let $x_1, \ldots, x_n$ represent the values of $y_1, \ldots, y_n$ arranged in increasing order of magnitude.
Since the cdf's are continuous, $\Pr(x_i = x_j) = 0$ for $i \ne j$. For the situations
treated in this paper, it is sufficient to consider one-sided confidence intervals
for $\phi$. All one-sided confidence intervals derived have one of the forms

(1) $$g(x_1, \ldots, x_n) < \phi, \qquad h(x_1, \ldots, x_n) > \phi,$$

where $g$ and $h$ are Borel measurable functions of $x_1, \ldots, x_n$ such that

$$\Pr[g(x_1, \ldots, x_n) < \phi] = \Pr[g(x_1 - \phi, \ldots, x_n - \phi) < 0],$$

$$\Pr[h(x_1, \ldots, x_n) > \phi] = \Pr[h(x_1 - \phi, \ldots, x_n - \phi) > 0].$$

Consider the additional condition

(B) All populations are the same.

In terms of cumulative distribution functions, condition (B) requires that all
the cdf's $F_i$ are equal to some cdf $F$. A theorem will be proved which shows that
all confidence intervals of the forms (1) derived on the basis of both conditions
(A) and (B) are also valid if only conditions (A) necessarily hold, i.e. if

$$\Pr[g(x_1, \ldots, x_n) < \phi] = p$$

whenever $x_1, \ldots, x_n$ are order statistics of observations from populations satisfying
conditions (A) and (B), then this probability expression also has the value
$p$ if $x_1, \ldots, x_n$ are from populations necessarily satisfying only conditions (A).
Similarly for $\Pr[h(x_1, \ldots, x_n) > \phi]$.

THEOREM 1. Let $Q(x_1 - \phi, \ldots, x_n - \phi)$ be a probability statement involving
$(x_1 - \phi), \ldots, (x_n - \phi)$ which defines a Borel measurable region $R(x_1 - \phi, \ldots, x_n - \phi)$
of the $n$-dimensional order statistic space. If

(2) $$Q(x_1 - \phi, \ldots, x_n - \phi) = p$$

whenever $x_1, \ldots, x_n$ are order statistics of $n$ independent observations from populations
satisfying conditions (A) and (B), then (2) also holds when $x_1, \ldots, x_n$ are
order statistics of $n$ independent observations from populations necessarily satisfying
only conditions (A).

PROOF. It is sufficient to consider the case in which $\phi = 0$. Then, if conditions
(A) are satisfied, the joint probability element of $x_1, \ldots, x_n$ is

$$dF(x_1, \ldots, x_n) = \sum_{\pi} dF_1(x_{\pi(1)}) \cdots dF_n(x_{\pi(n)}),$$

where the summation is taken over all permutations $\pi$ of the integers $1, \ldots, n$,
and $F_1, \ldots, F_n$ are cdf's of symmetrical populations with zero median. Let $R =
R(x_1, \ldots, x_n)$ be the region of the $n$-dimensional order statistic space defined
by the probability statement $Q(x_1, \ldots, x_n)$. Then Theorem 1 stipulates that

(3) $$\int_R dF(x_1, \ldots, x_n) = p$$

whenever $y_1, \ldots, y_n$ are from populations satisfying conditions (A) and (B)
with zero median. In this case, however, each $F_i = F$ and (3) becomes

(4) $$n! \int_R \prod_{i=1}^{n} dF(x_i) = p,$$

where $F$ is the cdf of a population satisfying conditions (A) and (B) with zero
median. Let

$$P = \prod_{j=1}^{n} \Big( \sum_{i=1}^{n} dF_i(x_j) \Big)$$

and define $S_\alpha^\beta$ to be the sum of all terms in the expansion of $P$ which contain a
specified $\alpha$ of $dF_1, \ldots, dF_n$ and no others; the particular set chosen is denoted
by $\beta$, where $\beta = 1, \ldots, \binom{n}{\alpha}$. Then

$$P = dF(x_1, \ldots, x_n) + \sum_{\beta} S_{n-1}^{\beta} + \cdots + \sum_{\beta} S_{1}^{\beta}.$$

Now consider any given $S_\alpha^\beta$ (i.e. $\alpha$, $\beta$ given). Define $dH$ to be the sum of the
$\alpha$ of $dF_1, \ldots, dF_n$ pertaining to $\beta$ plus any set of zero or more of the remaining
$dF$'s. Then no matter which of the remaining $dF$'s are chosen for $dH$, the sum
of those terms in the expansion of $\prod_{j=1}^{n} dH(x_j)$ which contain the particular set of
$\alpha$ of $dF_1, \ldots, dF_n$ is always equal to $S_\alpha^\beta$. Let

$$T_\alpha^\beta = \prod_{j=1}^{n} dA_\alpha^\beta(x_j),$$

where $dA_\alpha^\beta$ equals the sum of the $\alpha$ of $dF_1, \ldots, dF_n$ pertaining to $\beta$. Then from
the above and the symmetrical fashion in which the $dF$'s enter,

$$\sum_{\beta} T_\alpha^\beta = \sum_{\beta} S_\alpha^\beta + K_{\alpha-1} \sum_{\beta} S_{\alpha-1}^\beta + \cdots + K_1 \sum_{\beta} S_1^\beta,$$

where the $K_i$, $(i = 1, \ldots, \alpha - 1)$, are constants.

Consider the case in which $\alpha = n - 1$. Using the above expression for $\sum_\beta T_{n-1}^\beta$,

$$P = dF(x_1, \ldots, x_n) + \sum_{\beta} T_{n-1}^\beta + (1 - K_{n-2}) \sum_{\beta} S_{n-2}^\beta + \cdots + (1 - K_1) \sum_{\beta} S_1^\beta.$$

Repeating this procedure successively for $\alpha = n - 2, n - 3, \ldots, 1$ shows that

$$dF(x_1, \ldots, x_n) = P + C_{n-1} \sum_{\beta} T_{n-1}^\beta + \cdots + C_1 \sum_{\beta} T_1^\beta,$$

where the $C_\alpha$, $(\alpha = 1, \ldots, n - 1)$, are constants.

Since each $F_i$ is the cdf of a symmetrical population with zero median,

$$G_\alpha^\beta = \frac{1}{\alpha}\,(\text{sum of the } \alpha \text{ of } F_1, \ldots, F_n \text{ pertaining to } \beta)$$

is also the cdf for a continuous symmetrical population with zero median. But

$$T_\alpha^\beta = \alpha^n \prod_{j=1}^{n} dG_\alpha^\beta(x_j).$$

Hence $dF(x_1, \ldots, x_n)$ is equal to a sum of terms (multiplied by certain constants)
of the form

$$\prod_{j=1}^{n} dF(x_j),$$

where $F$ is the cdf of a continuous symmetrical population with zero median.
Thus from (4) and the linear properties of the integral,

$$\int_R dF(x_1, \ldots, x_n) = p$$

if $y_1, \ldots, y_n$ are from populations necessarily satisfying only conditions (A).

Next confidence intervals of the forms (1) will be derived for $\phi$ on the basis of
conditions (A) and (B). Before stating the theorem on which these confidence
intervals are based consider the following definition of notation: For each permissible
selection of $i$ and $j$ the symbol

$$[i, j], \qquad (1 \le i \le j \le n),$$

denotes an arbitrary but fixed selection of one or both of the inequality signs
$<$, $>$. The selection of both inequality signs, denoted by $\lessgtr$, has the interpretation

$$x_i \lessgtr \phi: \qquad -\infty < x_i < \infty,$$

$$(x_i + x_j)/2 \lessgtr \phi: \qquad -\infty < (x_i + x_j)/2 < \infty.$$

It is to be noted that $[r, s]$ is not necessarily equal to $[i, j]$ unless $r = i$ and
$s = j$.

THEOREM 2. Consider the probability statement

(5) $$\Pr[(x_i + x_j)/2\ [i, j]\ \phi;\ 1 \le i \le j \le n].$$

Let this statement have the value $q$ if $x_1, \ldots, x_n$ are order statistics of a sample of
size $n$ drawn from the uniform population with range $-\tfrac12$ to $\tfrac12$ (then $\phi = 0$). Then
(5) also has the value $q$ if $x_1, \ldots, x_n$ are the order statistics of a sample of size $n$ drawn
from any population satisfying (A) and (B).

PROOF. Let $y_1, \ldots, y_n$ be a sample of $n$ values from a population satisfying
conditions (A) and (B) while $x_1, \ldots, x_n$ are the $y$'s arranged in increasing order
of magnitude. Then there is a monotonic function $\pi$ (see [1]) such that $\pi(z)$ will
have the same cdf as $y_i - \phi$ if $z$ is from a uniform population with range $-\tfrac12$ to $\tfrac12$.
Since the $y$'s are from a symmetrical population, $-\pi(z) = \pi(-z)$. Let $x_i - \phi =
\pi(z_i)$, $(i = 1, \ldots, n)$, define the $z_i$. Then

$$\Pr[(x_i + x_j)/2\ [i, j]\ \phi] = \Pr[\pi(z_i)\ [i, j]\ -\pi(z_j)].$$

From the monotone and symmetrical properties of the function $\pi$,

$$\Pr[\pi(z_i)\ [i, j]\ \pi(-z_j)] = \Pr[(z_i + z_j)/2\ [i, j]\ 0].$$

By hypothesis this last expression has the value $q$, thus completing the proof.
Many of the probability statements of the form (5) have zero probability.

For emrople, Fr[ir. > d>, *» < -f. ' ) = 0 

result m equivalent probability statoincnLs l'\)r omuhiiId 






An immediate consequence of Theorem 2 is that one-sided confidence intervals
can be obtained for $\phi$ by choosing any specified subset of $(x_i + x_j)/2$,
$(1 \le i \le j \le n)$, and considering an arbitrary but fixed order statistic of the
values of this subset. For example, consider the subset consisting of $x_{n-1}$ and
$(x_{n-2} + x_n)/2$. Then

$$\Pr\{\max[x_{n-1},\ (x_{n-2} + x_n)/2] < \phi\} = \Pr[(x_i + x_j)/2\ [i, j]\ \phi;\ 1 \le i \le j \le n],$$

where $[i, j]$ is $<$ if either $i = j = n - 1$, or $i = n - 2$, $j = n$, and $[i, j]$ is $\lessgtr$
otherwise.

In general, the confidence coefficient of any one-sided confidence interval
formed by considering a certain order statistic of a specified subset of $(x_i + x_j)/2$,
$(1 \le i \le j \le n)$, can be expressed as a sum of probabilities of the form (5),
where $[i, j]$ is $\lessgtr$ if $(x_i + x_j)/2$ is not included in the specified subset, $(i \le j)$.

It is usually preferable to select the subset of $(x_i + x_j)/2$, $(1 \le i \le j \le n)$,
in such a way that no two of the elements chosen necessarily have an order
relation.

Satisfactory two-sided confidence intervals can usually be obtained as combinations
of one-sided confidence intervals.


6. Confidence coefficients. The purpose of this section is to show that all
the confidence coefficients for one-sided confidence intervals derived on the basis
of Theorem 2 are of the form $r/2^n$, $(r = 1, \ldots, 2^n - 1)$. Also a method of
determining confidence coefficient values for one-sided confidence intervals is
developed.

First a theorem will be presented which shows that each of the one-sided confidence
intervals derived in the preceding section has a confidence coefficient of
the form $r/2^n$, $(r = 1, \ldots, 2^n - 1)$. On the basis of Theorem 2 it is sufficient
to prove:



THEOREM 3. Let $x_1, \ldots, x_n$ be the ordered values of a sample from the uniform
population with range $-\tfrac12$ to $\tfrac12$. Then

$$\Pr[(x_i + x_j)/2\ [i, j]\ 0;\ 1 \le i \le j \le n] = r/2^n,$$

where $r$ has one of the values $0, 1, \ldots, 2^n$. (The symbol $[i, j]$ is defined in section
5.)

SKETCH OF PROOF. This theorem is proved by investigating how the hyperplanes

$$\tfrac12(x_i + x_j) = 0, \qquad (1 \le i \le j \le n),$$

subdivide the $n$-dimensional order statistic space for the particular population
considered. It is found that each relation of the form

$$\tfrac12(x_i + x_j)\ [i, j]\ 0, \qquad (1 \le i \le j \le n),$$

defines a region of the $n$-dimensional order statistic space which consists of a
certain number $r$ of $n$-dimensional "basic" cells each of which has an $n$-dimensional
"volume" equal to $(\tfrac12)^n$. A detailed proof of this theorem is given in
[5].
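The count $r$ can be checked by brute force for small $n$. The sketch below is an illustrative device and not the proof referred to above. It relies on a standard fact about continuous populations symmetrical about zero: the signs of the observations are independent of their absolute values and each sign is equally likely, so an event built from comparisons of $x_i$ and $(x_i + x_j)/2$ with zero depends only on which of the $2^n$ sign assignments to the ranked absolute values occurred. The function name and the use of the ranks $1, \ldots, n$ as stand-in absolute values are assumptions made here.

```python
from itertools import product


def count_sign_patterns(n, event):
    """Count the sign assignments (out of 2**n) for which `event` holds.

    The absolute values are represented by their ranks 1..n. For each sign
    pattern the signed ranks are sorted into x[0] <= ... <= x[n-1], i.e. the
    order statistics x_1, ..., x_n, and `event(x)` is evaluated on them.
    """
    count = 0
    for signs in product((-1, 1), repeat=n):
        x = sorted(s * rank for s, rank in zip(signs, range(1, n + 1)))
        if event(x):
            count += 1
    return count


def example_event(x):
    """The event max[x_{n-1}, (x_{n-2} + x_n)/2] < 0 from section 5."""
    return max(x[-2], 0.5 * (x[-3] + x[-1])) < 0
```

Dividing the returned count by $2^n$ gives the probability of the event when $\phi = 0$. For instance, `count_sign_patterns(6, example_event)` counts the patterns behind the coefficient of $\max[x_5, (x_4 + x_6)/2] < \phi$ for $n = 6$. The enumeration grows as $2^n$, so it is only practical for small samples.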

Next a method will be developed whereby confidence coefficient values can
be determined for any one-sided confidence interval of the form

(6) $$(x_i + x_j)/2\ [i, j]\ \phi, \qquad (1 \le i \le j \le n).$$

For this purpose it is sufficient to derive a procedure for determining the confidence
coefficient of any confidence interval of the form

(7) $$\max[\text{certain subset of } \tfrac12(x_i + x_j),\ 1 \le i \le j \le n] < \phi.$$

The confidence coefficient of any one-sided confidence interval of the form
$\min[\ ] > \phi$ can be obtained by symmetry. The confidence coefficient of any
other one-sided confidence interval of the form (6) can be found by expressing
the value of

$$\Pr[\tfrac12(x_i + x_j)\ [i, j]\ \phi]$$

as a sum of terms of the form $\Pr\{\max[\ ] < \phi\}$ or as a sum of terms of the form
$\Pr\{\min[\ ] > \phi\}$. That this is always possible for one-sided confidence intervals
of the form (6) is shown by direct application of the results of page 17 of [6].

It is not difficult to show that any one-sided confidence interval of the form
(7) can be expressed in the form

$$\max\{x(n - k),\ \tfrac12[x(n - k + 1) + x(n - m_k - k + 1)],\ \ldots,\ \tfrac12[x(n) + x(n - m_1)]\} < \phi,$$

where

$$x(i) = x_i, \qquad (i = 1, \ldots, n),$$

and $m_1, \ldots, m_k$ are $k$ integers such that

$$n \ge m_1 > m_2 > \cdots > m_k > 0.$$

This is done by choosing $k, m_1, \ldots, m_k$ so that the two confidence intervals are
equivalent.

Thus it is sufficient to prove the following theorem.

THEOREM 4. Let $x(1), \ldots, x(n)$ represent the ordered values of $n$ independent
observations drawn from populations satisfying (A). Choose a set of $k$
integers $m_1, \ldots, m_k$ such that

$$n \ge m_1 > m_2 > \cdots > m_k > 0.$$

Then the one-sided confidence interval

(8) $$\max\{x(n - k),\ \tfrac12[x(n - k + 1) + x(n - m_k - k + 1)],\ \ldots,\ \tfrac12[x(n) + x(n - m_1)]\} < \phi,$$

where a term of the form $\tfrac12[x(n - h + 1) + x(n - m_h - h + 1)]$, $(h = 1, \ldots, k)$,
is to be deleted if $n - m_h - h + 1 = 0$, has the confidence coefficient

(9) $$2^{-n}\Big[\,1 + m_1 + \sum_{i_1=1}^{m_2}(m_1 - i_1) + \sum_{i_2=1}^{m_3}\ \sum_{i_1=1}^{m_2 - i_2}(m_1 - i_1 - i_2) + \cdots + \sum_{i_{k-1}=1}^{m_k} \cdots \sum_{i_1=1}^{m_2 - i_2 - \cdots - i_{k-1}}(m_1 - i_1 - \cdots - i_{k-1})\Big].$$

SKETCH OF PROOF. It is sufficient to consider the case in which the $n$ observations
are a sample from the uniform population with range $-\tfrac12$ to $\tfrac12$ (then $\phi = 0$).
Let us consider the region of the $n$-dimensional order statistic space defined by
(8). This region can be considered as an intersection of $n$-dimensional regions
each of which is completely defined by a certain region in an $x_i$, $x_j$ plane.
The $n$-dimensional "volume" of this region equals the
value of the confidence coefficient of the confidence interval (8).

By Theorem 3, the intersection region of (8) consists of a certain number of
basic cells, each of $n$-dimensional "volume" $(\tfrac12)^n$. Theorem 4 is proved by
examining the regions in the $x_i$, $x_j$ planes. It is found
that the intersection region consists of

$$1 + m_1 + \sum_{i_1=1}^{m_2}(m_1 - i_1) + \cdots + \sum_{i_{k-1}=1}^{m_k} \cdots \sum_{i_1=1}^{m_2 - i_2 - \cdots - i_{k-1}}(m_1 - i_1 - \cdots - i_{k-1})$$

basic cells. A derivation of this expression is given in [5].
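The number of basic cells can be evaluated mechanically. The following sketch assumes the nested-sum reading of the expression in Theorem 4: the count is $1 + m_1$ plus, for each depth $t = 1, \ldots, k - 1$, a $t$-fold sum of $(m_1 - i_1 - \cdots - i_t)$ in which $i_t$ runs from 1 to $m_{t+1}$ and each inner index $i_s$ runs from 1 to $m_{s+1} - i_{s+1} - \cdots - i_t$. The function names are hypothetical, and the scan of the formula is damaged, so the reading should be checked against small cases before being relied on.

```python
from fractions import Fraction


def basic_cell_count(m):
    """Number of basic cells for integers m = (m_1, ..., m_k), m_1 > ... > m_k > 0."""
    m = list(m)
    if any(a <= b for a, b in zip(m, m[1:])) or (m and m[-1] <= 0):
        raise ValueError("need m_1 > m_2 > ... > m_k > 0")
    if not m:
        return 1  # k = 0: the interval x(n) < phi
    total = 1 + m[0]
    for depth in range(1, len(m)):
        total += _nested_sum(m, depth, 0)
    return total


def _nested_sum(m, level, used):
    """Sum over i_level, ..., i_1, where `used` is the total of the outer indices.

    Index i_level runs from 1 to m_{level+1} - used (m[level] with 0-based
    indexing); the innermost summand is m_1 - (sum of all indices).
    """
    if level == 0:
        return m[0] - used
    upper = m[level] - used
    return sum(_nested_sum(m, level - 1, used + i) for i in range(1, upper + 1))


def confidence_coefficient(n, m):
    """Confidence coefficient r / 2**n of the interval defined by n and m."""
    if m and m[0] > n:
        raise ValueError("need n >= m_1")
    return Fraction(basic_cell_count(m), 2 ** n)
```

As a usage note, `confidence_coefficient(7, (3, 2))` returns `Fraction(7, 128)`, which is about .0547. Exact fractions are used so that coefficients of the form $r/2^n$ are not blurred by rounding.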

Now consider some examples of the application of Theorem 4. If $n = 11$,
$k = 3$, $m_1 = 11$, $m_2 = 5$, $m_3 = 2$, the associated confidence interval

$$\max[x_8,\ \tfrac12(x_7 + x_9),\ \tfrac12(x_5 + x_{10})] < \phi$$

has a confidence coefficient equal to $103/2^{11}$. If $n = 12$ instead of 11, the confidence
coefficient is $103/2^{12}$ while the confidence interval becomes

$$\max[x_9,\ \tfrac12(x_8 + x_{10}),\ \tfrac12(x_6 + x_{11}),\ \tfrac12(x_1 + x_{12})] < \phi.$$

² For the trivial case in which $k = n$ the value of (9) is unity.

j\s iinollwr esumpli'i let u - 11 uiwl eonbider the cunhtlmcc intciviil 

|«3, jff’a h 1 * 7)1 -f' JCfi), H- ai)] < (jt 

IJeic / 3 find 'nfnijau-^iri with (8) shows thiil this confuloncG inLcival eatia- 

(itrl rtiniri'iJi I wiili j;ii 7, wh = 0 , /iij p= 2 I'lms it 1ms a conlitlenco coclR- 

ni III cHju il o> al 2 " 

Theorem 3 shows that each one-sided confidence interval developed on the
basis of Theorem 2 has a confidence coefficient of the form $r/2^n$, $(0 \le r \le 2^n)$.
The question arises as to whether the one-sided confidence intervals defined by
Theorem 4 have confidence coefficients which attain each of the values $1/2^n$,
$2/2^n$, $\ldots$, $(2^n - 1)/2^n$. That this is not the case is proved as follows. The
totality of different confidence intervals of the form (8) is equal to $2^n - 1$. This
is shown by counting how many ways the integers $m_1, \ldots, m_k$ can be selected
subject to the conditions $n \ge m_1 > m_2 > \cdots > m_k > 0$. It is easily seen that
there are $\binom{n}{k}$ possible ways. Summing over the possible values of $k$ yields
$2^n - 1$. This figure is increased to $2^n$ if the confidence interval $x_n < \phi$ is also
included. Examination of (9) shows, however, that two different selections
of $k$, $m_1, \ldots, m_k$ will result in the same value of (9) for more than one case.
Thus the one-sided confidence intervals of Theorem 4 do not have confidence
coefficients which attain each of the values $1/2^n, \ldots, (2^n - 1)/2^n$.

Although the class of one-sided confidence intervals defined by Theorem 4 do
not have confidence coefficients which attain each of the values $1/2^n$, $2/2^n$, $\ldots$,
$(2^n - 1)/2^n$, they do have another property which is important from a practical
point of view. If a certain confidence coefficient can be obtained for a particular
value of $n$, then this confidence coefficient can also be obtained for all greater
values of $n$. This result is a consequence of the following theorem.

THEOREM 5. Let $x(1), \ldots, x(n)$ be the ordered values of $n$ independent observations
drawn from populations satisfying conditions (A). Then if a confidence interval
of the form (8) has the confidence coefficient $\epsilon$ for a certain value $n_0$ of $n$, it is
always possible to obtain another confidence interval of the form (8), which has the
confidence coefficient $\epsilon$ for the value $n_0 + 1$.

PROOF. Let $m_1, \ldots, m_k$ be the integers corresponding to the given confidence
interval of form (8). These integers satisfy the condition

$$n_0 \ge m_1 > m_2 > \cdots > m_k > 0.$$

Let $n_0$ be replaced by $n_0 + 1$ and consider the new set of integers $(m_1 + 1)$,
$(m_2 + 1)$, $\ldots$, $(m_k + 1)$, 1. Evidently

$$n_0 + 1 \ge m_1 + 1 > \cdots > m_k + 1 > 1 > 0.$$

Hence these integers can be used to define a confidence interval of the form (8).
Also it is easily verified that

$$1 + (m_1 + 1) + \sum_{i_1=1}^{m_2+1}(m_1 + 1 - i_1) + \cdots + \sum_{i_k=1}^{1}\ \sum_{i_{k-1}=1}^{m_k + 1 - i_k} \cdots \sum_{i_1=1}^{m_2 + 1 - i_2 - \cdots - i_k}(m_1 + 1 - i_1 - \cdots - i_k)$$

$$= 2\Big[\,1 + m_1 + \sum_{i_1=1}^{m_2}(m_1 - i_1) + \cdots + \sum_{i_{k-1}=1}^{m_k} \cdots \sum_{i_1=1}^{m_2 - i_2 - \cdots - i_{k-1}}(m_1 - i_1 - \cdots - i_{k-1})\Big].$$

Thus the new confidence interval has the same confidence coefficient as the given
confidence interval.

From symmetry considerations, the one-sided confidence interval

$$\min\{x(k + 1),\ \tfrac12[x(k) + x(m_k + k)],\ \ldots,\ \tfrac12[x(1) + x(m_1 + 1)]\} > \phi,$$

where a term of the form $\tfrac12[x(h) + x(m_h + h)]$, $(h = 1, \ldots, k)$, is to be deleted
if $m_h + h = n + 1$, has the same confidence coefficient as the one-sided confidence
interval (8), i.e. its confidence coefficient is given by (9).


7. Efficiency of some tests based on conditions (A). Let us consider the case in which the $n$ observations used for a test are a sample from a normal population with unknown variance. The purpose of this section is to investigate the efficiency of some tests based on conditions (A) for this special case.
The method used to obtain efficiencies is outlined in section 3. Only one-sided and symmetrical tests are considered. For this purpose it is sufficient to limit investigations to one-sided tests of $\phi < \phi_0$.

If the subset of the $\frac{1}{2}(x_i + x_j)$, $(1 \le i \le j \le n)$, chosen for a test is not of one of

the forms




(t < j). 

(• < J <* k), 


(b) ^ix^ 4- a!/), 

(c) + v J - .V,, 

the determination of power function values requires a numerical double or higher order integration. Such numerical integrations are extremely lengthy. For this reason only one-sided significance tests based on subsets of the forms (a)-(c) will be investigated.

Let the normal population have variance $\sigma^2$ and consider one-sided tests of $\phi < \phi_0$ based on subsets of the form (a). Then



HIOMHCA.NCB TfcSTf4 FOU TIEL MUDIAN 




Pown Fumlmn " /V (a-, < i/)o) 


\'.hcrp 





ff / 1-1 




d (iftO — 0)/tf, N{d) = j ^ (^1/ 

The power function values listed for the test $x_i < \phi_0$ in Table 2 were computed from the above expression. The corresponding values for the $t$-test were computed from the normal approximation given in [2].

For subsets of forms (b) and (c) the expression for the power function is more complicated and will not be either derived or stated here. For any particular case, however, a simple analysis will yield an expression for the power function which requires only a first order numerical integration. General expressions for the power functions when the subsets are of the forms (b) and (c) are stated and derived in [5].

Table 2 contains power function values and efficiencies for several tests based on subsets of the forms (b) and (c). The power function values were computed by approximate integration (Simpson's rule, etc.). The $t$-test power function values were obtained by using the normal approximation. The power efficiencies stated in Table 1 for tests which do not appear in Table 2 were computed in [5], where a table of power function values is also given.
Examination of Table 2 shows that many of the tests formed from subsets of types (b) and (c) are very efficient for small values of $n$. The efficiency appears to decrease as $n$ increases. Also the efficiency of a test depends strongly on the subset of the $\frac{1}{2}(x_i + x_j)$, $(1 \le i \le j \le n)$, used to form the test. For example, let $n = 10$. The test


Accept < ft>ii i/inax [r» > ^(^^ + iio)l < <^0 

has a significance level of approximately .01 but an efficiency of only 82%. However the test


Accept </> < 001/ maxla-fl, + iiio)] < 4>c 

also has a significance level of approximately .01 but an efficiency of 96.5%.
An approximate set of rules for picking subsets which result in efficient tests of $\phi < \phi_0$ is suggested by the results of Table 2. Let $x(i_1), \ldots, x(i_r)$ be the order statistics which make up the elements of the particular subset of the $\frac{1}{2}(x_i + x_j)$, $(1 \le i \le j \le n)$, to be used for the test. The approximate rules are:
1. Use the maximum of the values of the elements of the subset.
2. Choose $i_1, \ldots, i_r$ so that $\max(i_1, \ldots, i_r) = n$ and $\min(i_1, \ldots, i_r)$ is as large as possible subject to the restriction that the test is to have a significance level of a specified order of magnitude.
Symmetry considerations furnish the corresponding set of rules for obtaining efficient tests of $\phi > \phi_0$.





Other tests at approximately the same significance level, but not based on subsets of the forms (a)-(c), are undoubtedly more efficient than many of the tests considered in Tables 1 and 2 (particularly for the larger values of $n$). Computational difficulties, however, prevent consideration of more general situations.

8. A general solution.* A general method of obtaining one-sided tests of $\phi < \phi_0$ and $\phi > \phi_0$, also symmetrical tests of $\phi = \phi_0$, on the basis of conditions (A) is the following:

Let $y_1, \ldots, y_n$ be $n$ independent observations drawn from populations satisfying conditions (A). Let

$z_i = y_i - \phi_0$,   $(i = 1, \ldots, n)$.

If the null hypothesis of $\phi = \phi_0$ is satisfied, each $z_i$ is an observation from a population satisfying conditions (A) with zero median. Consider the $2^n$ sets of values obtained by the transformations

$\epsilon(i) z_i$,   $(i = 1, \ldots, n)$,

where $\epsilon(i)$ is one of the signs + or -. Form the mean of each of the $2^n$ sets of values. Then it is readily seen, from conditions (A), that the probability that $\bar{z}$ $(= \sum z_i/n)$ is less than the $(r + 1)$th largest of the $2^n$ means has the value $r/2^n$ when the null hypothesis is true. Similarly the probability that $\bar{z}$ is greater than the $(2^n - r)$th largest of the $2^n$ means is equal to $r/2^n$ if the null hypothesis of $\phi = \phi_0$ is satisfied. Thus the test

Accept $\phi < \phi_0$ if $\bar{z}$ is less than the $(r + 1)$th largest of the $2^n$ means

is a one-sided test of $\phi < \phi_0$ with significance level equal to $r/2^n$. Likewise the one-sided test

Accept $\phi > \phi_0$ if $\bar{z}$ is greater than the $(2^n - r)$th largest of the $2^n$ means

has the significance level $r/2^n$. Consequently the symmetrical test

Accept $\phi \ne \phi_0$ if $\bar{z}$ is either less than the $(r + 1)$th largest or greater than the $(2^n - r)$th largest of the $2^n$ means

has a significance level equal to $2r/2^n$.

The application of any of the above tests requires the computation of the $2^n$ means and a determination of where $\bar{z}$ falls in the ordering of these means. If $n = 5$, only 32 means need be computed. If $n = 10$, however, 1024 means must be computed. Evidently the test is too cumbersome to apply except for very small values of $n$.
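The computation described in this section can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the sample values are hypothetical and do not come from the paper. It forms the $2^n$ sign-changed means of the $z_i = y_i - \phi_0$ and reports where the observed mean falls among them.

```python
from itertools import product


def sign_flip_means(y, phi0):
    """Return (observed mean of z, sorted list of the 2^n sign-changed means)."""
    z = [v - phi0 for v in y]
    n = len(z)
    # One mean for every assignment of + or - signs to the z_i.
    means = sorted(
        sum(s * v for s, v in zip(signs, z)) / n
        for signs in product((1, -1), repeat=n)
    )
    zbar = sum(z) / n
    return zbar, means


def rank_from_top(y, phi0):
    """Count how many of the 2^n means are at least as large as the observed mean."""
    zbar, means = sign_flip_means(y, phi0)
    return sum(1 for m in means if m >= zbar)


if __name__ == "__main__":
    y = [1.2, 0.4, 2.1, 1.7, 0.9]      # hypothetical observations, n = 5
    zbar, means = sign_flip_means(y, phi0=0.0)
    print(len(means), "means computed; observed mean =", zbar)
    print("means at least as large as the observed mean:", rank_from_top(y, 0.0))
```

The number of means doubles with each added observation, which is the reason the test becomes impractical by hand beyond very small $n$.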


9. Acknowledgements. The author would like to express his appreciation to Professors S. S. Wilks and John W. Tukey for valuable advice and assistance in the preparation of this paper, and to Mrs. Ruth S. Shafer for computational assistance.

* Derived independently by E. J. G. Pitman and the author. The fundamental idea on which the solution is based was presented by R. A. Fisher in [7].


REFERENCES

[1] John E. Walsh, "Applications of some significance tests for the median which are valid under very general conditions," submitted to Am. Stat. Assoc. Jour.
[2] N. L. Johnson and B. L. Welch, "Applications of the non-central t distribution," Biometrika, Vol. 31 (1940), p. 376.
[3] John E. Walsh, "On the power function of the sign test for slippage of means," Annals of Math. Stat., Vol. 17 (1946), pp. 358-362.
[4] H. Scheffé and J. W. Tukey, "Non-parametric estimation. I. Validation of order statistics," Annals of Math. Stat., Vol. 16 (1945), pp. 187-192.
[5] John E. Walsh, "Some significance tests for the median which are valid under very general conditions," unpublished thesis, Princeton University.
[6] G. Udny Yule and M. G. Kendall, An Introduction to the Theory of Statistics, Griffin and Co., 1947.
[7] R. A. Fisher, The Design of Experiments, Oliver and Boyd, 1945.



A DIRECT METHOD FOR PRODUCING RANDOM DIGITS IN ANY NUMBER SYSTEM

By H. Burke Horton and R. Tynes Smith III

Interstate Commerce Commission

1. Summary. A compounding technique first used to produce random binary digits is generalized and extended to other number systems. Formulae for the rate of convergence of probabilities to the desired values are derived. The method is extended to the production of random digits with fixed but unequal probabilities. Numerical results are presented in summary form together with results of tests applied to a set of random digits produced by the method.

2. Introduction. In a note [1] by one of the authors a method of producing random digits was presented. The method was based upon a technique, "compound randomization," used to produce random binary digits which can be converted to random digits in other number systems by simple methods. Despite the ease of converting a random binary series to another system, it is of interest to examine the problem of direct production of random digits in any number system. In the course of producing random binary digits with machine tabulating equipment, and while designing an electronic device to produce random binary digits, it was noted that the multiplication process described in the earlier paper was the equivalent of addition modulo 2 of a set of binary digits. This observation laid the basis for generalizing to other number systems.¹

3. Initial conditions and notation. Let us assume that there is available a source of digits, 0, 1, 2, ..., $(n - 1)$, in a number system of base $n$, where $n$ is a positive integer, $n > 1$. Let $p_{r,s}$ represent the probability of obtaining the $r$th digit in the $s$th trial. Assume that initial conditions can be controlled so that the trials are independent,² and

(3.1)  $p_{r,s} \ge \epsilon$,

where $0 < \epsilon \le 1/n$ is a fixed positive number. (It may be noted at this point that conventional "single-stage" methods of producing random numbers are based upon the assumption that $\epsilon = 1/n$.) Let $\pi_{r,s}$ represent the probability of obtaining the $r$th digit by addition modulo $n$ of the digits obtained in $s$ individual trials. In order to express $\pi_{r,s}$ in terms of $p_{r,s}$, consider two sets of matrices whose elements are defined as follows:

¹ In acting as referee for [1] Dr. George W. Brown suggested generalizing to other number systems by addition modulo $n$.

² J. E. Walsh [2] has considered, in terms of conditional probabilities, the effect of inter-correlation on compound randomization in the binary system.





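(3.2)  $a_s = \begin{pmatrix} p_{0,s} & p_{n-1,s} & p_{n-2,s} & \cdots & p_{1,s} \\ p_{1,s} & p_{0,s} & p_{n-1,s} & \cdots & p_{2,s} \\ \vdots & \vdots & \vdots & & \vdots \\ p_{n-1,s} & p_{n-2,s} & p_{n-3,s} & \cdots & p_{0,s} \end{pmatrix}$,

(3.3)  $\alpha_s = \begin{pmatrix} \pi_{0,s} & \pi_{n-1,s} & \pi_{n-2,s} & \cdots & \pi_{1,s} \\ \pi_{1,s} & \pi_{0,s} & \pi_{n-1,s} & \cdots & \pi_{2,s} \\ \vdots & \vdots & \vdots & & \vdots \\ \pi_{n-1,s} & \pi_{n-2,s} & \pi_{n-3,s} & \cdots & \pi_{0,s} \end{pmatrix}$.

Note that $a_s$ and $\alpha_s$ are stochastic matrices with two additional restrictions: (1) no element is zero and (2) each column (as well as each row) sums to unity. Each $n \times n$ matrix is made up of only $n$ distinct elements, namely, the $n$ different probabilities associated with the $s$th trial for $a_s$, or the $n$ different probabilities associated with the sum of $s$ trials for $\alpha_s$.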

4. Relation of $\pi_{r,s}$ to $p_{r,s}$. Assuming independent trials, we have the following relationships:

(4.1)  $\alpha_1 = a_1$,
       $\alpha_2 = a_2 \alpha_1 = a_2 a_1$,
       $\alpha_3 = a_3 \alpha_2 = a_3 a_2 a_1$,
       $\cdots$
       $\alpha_k = a_k \alpha_{k-1} = \prod_{i=1}^{k} a_i$.

Thus, since any row (or any column) of $\alpha_k$ is a permutation of the $\pi_{r,k}$, by (4.1) the $\pi_{r,k}$ are expressed in terms of the individual probabilities.
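Because every matrix involved has only $n$ distinct elements, the matrix products above amount to repeated cyclic convolution of the digit probabilities. The following Python sketch illustrates this under the paper's assumption of independent trials; the function names and the example probabilities are hypothetical.

```python
def add_mod_n(pi, p):
    """Distribution of (X + Y) mod n for independent X ~ pi and Y ~ p."""
    n = len(p)
    out = [0.0] * n
    for a, prob_a in enumerate(pi):
        for b, prob_b in enumerate(p):
            out[(a + b) % n] += prob_a * prob_b
    return out


def compound(trials):
    """Digit probabilities after adding, modulo n, one digit from each trial.

    `trials` is a list of probability vectors, one per trial.
    """
    pi = list(trials[0])
    for p in trials[1:]:
        pi = add_mod_n(pi, p)
    return pi


if __name__ == "__main__":
    biased = [0.5, 0.3, 0.2]            # hypothetical base-3 source
    for k in (1, 2, 5, 10):
        pi = compound([biased] * k)
        print(k, [round(x, 6) for x in pi], "range =", max(pi) - min(pi))
```

Each call to `add_mod_n` corresponds to one multiplication by a matrix of the form (3.2); only one column of the product needs to be carried along.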

5. Convergence of $\pi_{r,s}$ to $1/n$. (5.01) THEOREM.³ $\lim_{s \to \infty} \pi_{r,s} = 1/n$.

PROOF. Let $\rho_s$ denote the range of the elements of $\alpha_s$. Each element of $\alpha_s$ is a weighted mean of the $n$ distinct elements of $\alpha_{s-1}$. The $n$ distinct elements of $a_s$ are used as weights in the averaging process. Now the range of a set of weighted means (weights > 0) of a set of values must be less than the range of the values themselves, unless both ranges are zero. Therefore, since the weights $p_{r,s} > 0$ by condition (3.1),

(5.02)  $\rho_s < \rho_{s-1}$, for $\rho_{s-1} \ne 0$; or in the special case $\rho_{s-1} = 0$, $\rho_s = 0$.

Also, since $\sum_{r=0}^{n-1} \pi_{r,s} = 1$,

(5.03)  $1/n - \rho_s \le \pi_{r,s} \le 1/n + \rho_s$.

³ While this paper was awaiting publication, J. Wolfowitz independently proved theorem (5.01).

In order to show that $\lim_{s \to \infty} \rho_s = 0$, and to derive formulae for the rate of convergence of $\pi_{r,s}$ to the limiting value, $1/n$, let $w_i$ represent the ordered $p_{r,s}$ for any given $s$: $w_1$ = the smallest $p_{r,s}$, ..., $w_n$ = the largest of the $p_{r,s}$. In a similar manner let $x_i$ represent the ordered $\pi_{r,s-1}$. The following inequalities for the maximum and minimum $\pi_{r,s}$ can be set down immediately:

(5.04)  $\max_r \pi_{r,s} \le w_n x_n + w_{n-1} x_{n-1} + \cdots + w_1 x_1$,

(5.05)  $\min_r \pi_{r,s} \ge w_n x_1 + w_{n-1} x_2 + \cdots + w_1 x_n$.

And since $\rho_s = \max_r \pi_{r,s} - \min_r \pi_{r,s}$,

(5.06)  $\rho_s \le w_n(x_n - x_1) + w_{n-1}(x_{n-1} - x_2) + \cdots + w_2(x_2 - x_{n-1}) + w_1(x_1 - x_n)$.

For $n$ even, let $m = n/2 + 1$; then by regrouping terms,

(5.07)  $\rho_s \le (w_n - w_1)(x_n - x_1) + (w_{n-1} - w_2)(x_{n-1} - x_2) + \cdots + (w_m - w_{m-1})(x_m - x_{m-1})$.

Noting that $\rho_{s-1} = (x_n - x_1) \ge (x_{n-1} - x_2) \ge \cdots \ge (x_m - x_{m-1})$, the following substitutions can be made:

(5.08)  $\rho_s \le (w_n - w_1)\rho_{s-1} + (w_{n-1} - w_2)\rho_{s-1} + \cdots + (w_m - w_{m-1})\rho_{s-1}$.

For compactness, this may be written,

(5.09)  $\rho_s \le \left[ \sum_{i=m}^{n} w_i - \sum_{i=1}^{m-1} w_i \right] \rho_{s-1}$.

Similarly for $n$ odd, let $m = (n + 1)/2$; proceeding in the same manner as above, the median term vanishes, yielding as a final result,

(5.10)  $\rho_s \le \left[ \sum_{i=m+1}^{n} w_i - \sum_{i=1}^{m-1} w_i \right] \rho_{s-1}$.

For simplicity denote the expression in brackets by $\delta_s$; then

(5.11)  $\rho_s \le \delta_s \rho_{s-1}$,

where for $n$ even, $\delta_s$ represents the sum of the largest $n/2$ of the $p_{r,s}$ minus the sum of the smallest $n/2$ of the $p_{r,s}$, and for $n$ odd $\delta_s$ represents the sum of the largest $(n - 1)/2$ of the $p_{r,s}$ minus the sum of the smallest $(n - 1)/2$ of the $p_{r,s}$.
Continuing the process developed above, we find that

(5.12)  $\rho_s \le \delta_s \delta_{s-1} \rho_{s-2}$,

(5.13)  $\rho_s \le \delta_s \delta_{s-1} \cdots \delta_2 \rho_1$.



Since $\rho_1 \le \delta_1$, the following simple inequality holds:

(5.14)  $\rho_k \le \prod_{i=1}^{k} \delta_i$.

Now $\delta_s \le 1 - n\epsilon$, by condition (3.1) and the definition of $\delta_s$. Therefore,

(5.15)  $\lim_{k \to \infty} \rho_k \le \lim_{k \to \infty} \prod_{i=1}^{k} \delta_i \le \lim_{k \to \infty} (1 - n\epsilon)^k = 0$,

and (5.01) is proved. In the special case of constant probabilities from trial to trial, $\delta_s = \delta_0$, a constant, and (5.14) becomes

(5.16)  $\rho_k \le (\delta_0)^k$.

Since the mean $\pi_{r,k}$ is $1/n$, we have the following useful inequalities:

(5.17)  $1/n - \prod_{i=1}^{k} \delta_i \le \pi_{r,k} \le 1/n + \prod_{i=1}^{k} \delta_i$,

in the case of varying probabilities, and

(5.18)  $1/n - (\delta_0)^k \le \pi_{r,k} \le 1/n + (\delta_0)^k$,

in the case of constant probabilities. If $\delta_s$ is not known in each trial, an upper bound, $\delta_b$, may be estimated on the basis of knowledge (including statistical tests) of the digit generating process. Then the following inequality will hold:

(5.19)  $1/n - (\delta_b)^k \le \pi_{r,k} \le 1/n + (\delta_b)^k$,

where $\delta_b \le (1 - n\epsilon)$.
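The quantity $\delta_s$ and the resulting bound on the range for constant probabilities, $\rho_k \le (\delta_0)^k$, are simple to evaluate. The Python sketch below is illustrative only; the function names are hypothetical, and the example probabilities are chosen for convenience.

```python
def delta(p):
    """Sum of the largest half of the probabilities minus the sum of the smallest half.

    For n even the halves contain n/2 terms each; for n odd they contain
    (n - 1)/2 terms each and the median probability is left out.
    """
    w = sorted(p)
    half = len(w) // 2
    return sum(w[len(w) - half:]) - sum(w[:half])


def range_bound(p, k):
    """Upper bound delta**k on the range after k trials with constant probabilities p."""
    return delta(p) ** k


if __name__ == "__main__":
    for p in ([0.8, 0.2], [0.5, 0.3, 0.2], [0.4, 0.3, 0.3]):
        d = delta(p)
        # delta never exceeds 1 - n * (smallest probability), as used in the proof.
        print(p, "delta =", d, "1 - n*eps =", 1 - len(p) * min(p),
              "bound for k = 10:", range_bound(p, 10))
```

Since $\delta_0 < 1$ whenever every digit has positive probability, the bound shrinks geometrically with the length $k$ of the chain.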

It is worthy of note that inequalities (5.14) and (5.16) become equalities if $n = 2$ (binary system); thus,

(5.14b)  $\rho_k = \prod_{i=1}^{k} \delta_i = \prod_{i=1}^{k} |p_i - q_i| = \prod_{i=1}^{k} |2p_i - 1|$,

(5.16b)  $\rho_k = (\delta_0)^k = |p - q|^k = |2p - 1|^k$.

These results were obtained by different methods in [1].

6. Discussion of results. Certain facts are implicit in the foregoing analysis, but are worthy of mention in passing. The compounding process may consist of addition modulo $n$ of digits taken from a number of digit-producing machines. If any machine, $h$, is perfect, i.e., $p_{r,h} = 1/n$ for all $r$, each element of the probability matrix $\alpha_h$ will be equal to $1/n$ and $\rho_h = 0$. Consequently, each element of $\alpha_s$, $s \ge h$, will be equal to $1/n$ by (5.17) and the special case of (5.02). Thus any combination which contains a perfect machine is perfect. This is equivalent to a restatement of Von Mises' [3] requirement that the sum of a random set and any other set must itself be a random set. Furthermore, by (5.02) the results taken from any machine, no matter how nearly perfect, can be improved by combining with the results of another machine, no matter how bad the latter may be. In the limiting case, $p_{r,s} = 1$ ($\epsilon = 0$), the probabilities of the various digits are merely interchanged.


7. Production of random numbers with fixed but unequal probabilities. The principles presented above can be adapted to the production of random numbers with unequal probabilities as follows: Assume that a set of random digits, 0, 1, 2, ..., $(n - 1)$, is required in a number system of base $n$, with probabilities $q_0, q_1, q_2, \ldots, q_{n-1}$, $\sum_{i=0}^{n-1} q_i = 1$, where each $q_i$ is a proper rational fraction which may be written as the quotient of two positive integers, $q_i = u_i/v_i$. Choose $m$ as the basis of a new number system, where $m$ is the least common multiple of the $v_i$,

(7.1)  $q_i = \dfrac{u_i}{v_i} = \dfrac{m u_i / v_i}{m}$.

A set of random digits, 0, 1, 2, ..., $(m - 1)$, in a number system of base $m$ may be generated by the process described above, or a set of such digits may be constructed by entering an existing table of random digits, base $n$, and interpreting appropriate numerical quantities, base $n$, as digit symbols, base $m$. Since $m q_i = m u_i / v_i$ is an integer, groups of digits, $mq_0, mq_1, \ldots, mq_{n-1}$ in number, in the $m$ system may be coded as digits, 0, 1, 2, ..., $(n - 1)$, in the $n$ system. An upper bound for the maximum bias of $q_i$ will be $m q_i \rho_k$, where $\rho_k$ is the range of $\pi_{r,k}$ in the $m$ system. Thus, by increasing $k$, the bias of $q_i$ can be made smaller than any preassigned quantity.
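The coding step can be illustrated with a short Python sketch. It assumes the required probabilities are supplied as exact fractions; the function name and the example probabilities are hypothetical. The table it builds assigns $m q_i$ of the base-$m$ digits to the digit $i$ of the base-$n$ system.

```python
from fractions import Fraction
from math import gcd


def coding_table(q):
    """Return (m, table), where table[d] is the base-n digit coded by base-m digit d."""
    q = [Fraction(x) for x in q]
    if sum(q) != 1:
        raise ValueError("probabilities must sum to one")
    # m is the least common multiple of the denominators v_i.
    m = 1
    for x in q:
        m = m * x.denominator // gcd(m, x.denominator)
    table = []
    for digit, x in enumerate(q):
        table.extend([digit] * int(x * m))   # m * q_i consecutive base-m digits
    return m, table


if __name__ == "__main__":
    q = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]   # hypothetical target probabilities
    m, table = coding_table(q)
    print("work in base", m, "with coding table", table)
```

If the base-$m$ digits are equally likely, a lookup in `table` then yields the digits 0, 1, ..., $(n - 1)$ with the required unequal probabilities.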


8. Convergence under more general conditions. Convergence of $\pi_{r,s}$ to $1/n$ occurs under a variety of conditions less restrictive than (3.1).

(8.1) THEOREM. In the case of independent trials, a necessary and sufficient condition that $\lim_{s \to \infty} \pi_{r,s} = 1/n$ is that $\pi_{r,l} \ge \epsilon$, where $\epsilon$ is a fixed positive number, arbitrarily small, and $l$ is a fixed positive integer, arbitrarily large. It is obvious that (8.1) is a necessary condition for convergence. To prove that it is a sufficient condition, consider the following:

(8.2) LEMMA. If $p_{0,s} \ge \eta$ and $p_{1,s} \ge \eta$, where $\eta$ is a fixed positive number, arbitrarily small, then $\lim_{s \to \infty} \pi_{r,s} = 1/n$.

PROOF. Take a fixed integer, $h$, $h \ge n - 1$. Now any digit, $r$, can be obtained in at least one way, i.e., as the sum of $r$ ones and $(h - r)$ zeros. Therefore,

(8.3)  $\pi_{r,h} \ge \epsilon$, where $\epsilon \ge \eta^h$.

We may regard $h$ trials as a trial of a complex machine. Let $u$ represent the number of such complex trials. Let $\pi'_{r,u}$ represent the probability of obtaining the $r$th digit as the result of addition modulo $n$ of $u$ complex trials. Then,

(8.4)  $\lim_{u \to \infty} \pi'_{r,u} = \lim_{u \to \infty} \pi_{r,uh} = 1/n$,

by (5.01). Now if $s = uh + j$, ($1 \le j < h$, $j$ an integer), or $uh \le uh + j < (u + 1)h$, the $j$ simple trials cannot increase the maximum bias, by (5.02); consequently,

(8.5)  $\lim_{u \to \infty} \pi_{r,(uh+j)} = \lim_{u \to \infty} \pi_{r,uh} = 1/n$.

Since there is a one-to-one correspondence between the elements of $\{s\}$ and $\{uh + j\}$,

(8.6)  $\lim_{s \to \infty} \pi_{r,s} = 1/n$.

By a natural extension of the lemma, we may regard $l$ trials as a single complex trial. Theorem (8.1) thus assumes the form of (8.2).

9. Numerical results in various number systems. More efficient convergence formulae may be derived to meet special conditions. Those presented in (5) were chosen for simplicity and generality. To test the efficiency of (5.16) several numerical examples, based upon unusual hypothetical probabilities, were worked by matrix multiplication as in (4.1). In these problems $p_{r,s} = p_r$, i.e., is constant from trial to trial. A tabular comparison of the ranges, computed by (4.1), and the upper bounds, determined by (5.16), is presented in Table 1 for $k = 10$.

10. Preparation and tests of a set of random digits. Since an unlimited number of valid tests for randomness may be devised, it is obvious that any finite set of digits cannot meet all such tests. As a matter of fact a truly random process should yield sets which fail to meet some proportion of the tests, the fraction being determined by the level of significance adopted in testing. No finite set of digits can be considered random; the tests for randomness are really applied to determine the character of the generating process. However, the concept of "locally random" sets as developed by Kendall and Smith [4] is useful, and some of their tests are used below as evidence that a set of numbers produced by compound randomization is likely to be locally random.

A nuu riimlmu uf WU) iW<\\Mn\ iliglla having lUo rolutivo froquenciOB 
iiidiraii^d 111 ihv liiir of 'rnhlr 1 'Mu* pmM m curds and abulated. 

ToialHu.-ru (akin U^ruitU ii-n nmUmid tlmumminim 
r,nmur nil ... « v,.r,l, tU..rfl.y i.r«<Uicmg 

The frequencies of digits in the derived set are compared with those of the generating set in Table 2. The frequencies of the derived set are in accord with the hypothesis of equal probabilities.



TABLE 1


oj WUC m,I/«n,i„((.;ur .u„x.,.h™ f - IM 

Hy-poltmliMd nmiimctil 


Nuth 




PrQbqblllly In nn IntJivIdunl Irml 


fa&se 

pt 1 


>■ 


pf 


pt 



P* 

2 

800 

200 

— 



— 

— 

— 


— 

3 

500 

300 

200 


— 

— 


— 

— 


3 

070 

020 

010 

— 

— 

— 





3 

400 

300 

300 

— 

— 

— 

— 




i 

200 

100 

400 

300 

— 

— 

— 

— 



5 

050 

200 

400 

020 

330 

_ 

— 


— 

- 

6 

080 

210 

300 

020 

200 

100 



— 

— 

7 

300 

020 

240 

050 

130 

L70 

090 

— 


— 

8 

200 

050 

OGO 

180 

lOO 

090 

150 

no 


— 

9 

030 

,oso 

160 

000 

140 

000 

100 

050 

,210 


10 

050 

IBO 

200 

050 

050 

120 

080 

020 

ISO 

.IDC 

10 

010 

020 

030 

010 

050 

000 

070 

080 

090 

55(] 

IG 

no 

no 

no 

no 

no 

no 

no 

IIQ 

no 

OlU 

10 

160 

160 

160 

160 

160 

060 

060 

060 

050 

050 

10* 

014 

171 

164 

181 

023 

095 

017 

205 

039 

OiW 

12 

010 

070 

120 

100 

_ 

060 

020 

090 

010 

080 

lUi 


(5 « 


rA.j 


/'i 


_ 1 _ -< r l><Hi'Mrrill»l> I l>(KJ 


^ I j <n«hw(IM’pT 
— r M'dltril'Jtiil (tOj'-LN.Hrft I 

(KHSHUHHWM Uni 

1 

(KHKwivvr.s I 11^1 


INHj 7^7^177 iH iiipvs iri^) 

I 

(HKKHU‘<!7J' I^H»UHVtI7n' MXI 

i 

(KHJl7r‘*'^ll' 1 Mil 

(KHKKKHr'M''i| ' tWi 

(KKKXJ.WU"'' i7o 

t 

lH)l>t)l,lLJ(*l)J IH«lH7rrpi<„*’i , ,7X1 

(XiU'Vj^j2l'' 7'XI 

' ^KHKNKHMKU <X^XHxxhmU |iMt 

I (Khxhh»'Mh' iMn»'rpir,nj’’i '■jxj 

(KHMriOIMli l)MI74')^»|tk fiiM 


000 IfKt (KXKKHLWiO IHxj't/O.'rfpJ’i ,V«J 


* This badly biased set of probabilities was used to produce the set of random decimal digits tested in the next section.


TABLE 2


Digit 

0 

1 

2 

3 

4 

Generating get 

014 

17i 

104 

184 

023 

Derived get , 

088 

112 

080 

lOO 

.113; 


S 


0 


Frequency test (derived set): $\chi^2 = 7.0$

TABLE 3


18'1 023[ O'KV tll7',2tl“i (ISU (HbS 


fi,’} 


Tth digit 


0 

1 

2 

3 

4 

5 
G 
7 
S 


(t + l)lli liigiL 


11 

10 

11 

g 

Q 

9 

6 

13 

7 


8 

13 
10 
10 
12 
17 

14 
10 
8 


7 

IfJ 

7 
3 

10 

11 

9 

9 

8 


3 

i 

6 

II 

Vi 

‘ 

^ H 

^ ‘1 

1 

7 

b 

7 

12 

12 

n 

K 

9 

n 

J 1 

11 

H 

111 

‘ 11 

10 

10 

7 

0 

0 

i 7 

' II 

14 

12 

17 

U 

« 

II 

1 12 

10 

19 

U 

10 

U 


I 7 

14 

10 

0 



r( 

p 0 

9 

14 

10 

13 

H 

; 

! 10 

9 

8 

11 

7 

12 

7 

i 42 

n 

9 

tl 

ni 


1 I/I 

' 1/. 





In the serial test adjacent pairs of digits are tabulated. The distribution of pairs in the derived set appears in Table 3. This test indicates that adjacent digits are independent.


TABLE 4
Gap test

                               Length of gap
Digit   Frequencies     0-1      2-4      5-7    8 and over   $\chi^2$     P

  0     Observed         16       18       11        42         1.25     .75
        Expected      16.53    19.10    13.92     37.45

  1     Observed         27       27       21        36         5.44     .15
        Expected      21.09    24.37    17.76     47.78

  2     Observed         16       17       10        42         1.90     .60
        Expected      16.15    18.66    13.60     36.59

  3     Observed         20       20       18        46          .50     .92
        Expected      19.76    22.83    16.64     44.77

  4     Observed         31       17       20        44         7.39     .06
        Expected      21.28    24.59    17.92     48.21

  5     Observed         15       21       15        50         2.04     .57
        Expected      19.19    22.17    16.16     43.48

  6     Observed         27       25       12        36         5.95     .12
        Expected      19.00    21.95    16.00     43.05

  7     Observed         20       19       16        42          .40     .93
        Expected      18.43    21.29    15.52     41.76

  8     Observed         14       19       21        42         3.27     .35
        Expected      18.24    21.07    15.36     41.32

  9     Observed         18       18       21        40         2.53     .48
        Expected      18.43    21.29    15.52     41.76


The gap test is based upon the distribution of lengths of intervals between given digits. A comparison of the number of gaps of specified lengths with the expected number in each case is presented in Table 4. The results of this test are also in accord with the assumption of local randomness. Noting the badly biased probabilities of the initial set of digits, the results of these tests demonstrate the effectiveness of the compound randomization process.

The use of tabulating equipment for producing random decimal digits by addition modulo 10 is relatively fast and simple. The authors have just completed production of a set of 105,000 digits in less than two days' tabulating time. 75,000 cards, representing approximately 3 months' run of a current carload waybill study, were used to generate the digits, 14 non-correlated columns being added simultaneously. A chain of length 10 was used, although the nature of the initial data was such that a shorter length would probably have given satisfactory results. The derived set is now recorded on 1500 cards, 70 digits per card. Preliminary tests for local randomness confirm the random nature of the generating process. Upon completion of the tests this set will be reproduced in tabular form.
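The expected frequencies of the gap test follow from the geometric distribution of gap lengths: if each digit has probability $1/10$, a gap of length $g$ has probability $(1/10)(9/10)^g$. The Python sketch below is an illustration under that assumption, with hypothetical function names; it uses the gap classes 0-1, 2-4, 5-7, and 8 and over.

```python
def gap_class_probabilities(base=10, bounds=((0, 1), (2, 4), (5, 7))):
    """Probabilities of the gap-length classes, the last class being open-ended."""
    q = 1.0 - 1.0 / base
    # P(lo <= gap <= hi) = q**lo - q**(hi + 1) for a geometric gap length.
    probs = [q ** lo - q ** (hi + 1) for lo, hi in bounds]
    probs.append(q ** (bounds[-1][1] + 1))
    return probs


def expected_gap_counts(total_gaps, base=10):
    """Expected number of gaps in each class when total_gaps gaps are observed."""
    return [total_gaps * p for p in gap_class_probabilities(base)]


def chi_square(observed, expected):
    """Pearson chi-square statistic for one row of the gap test."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    expected = expected_gap_counts(87)
    print([round(e, 2) for e in expected])
    print(round(chi_square([16, 18, 11, 42], expected), 2))
```

For a digit with 87 gaps in all, the expected counts come to about 16.53, 19.10, 13.92 and 37.45, in agreement with the first row of Table 4.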



REFERENCES

[1] H. B. Horton, "A method for obtaining random numbers," Annals of Math. Stat., Vol. 19 (1948), pp. 81-85.
[2] J. E. Walsh, "Concerning compound randomization in the binary system," unpublished manuscript, Project RAND, Douglas Aircraft Co., Santa Monica, California.
[3] R. von Mises, Probability, Statistics and Truth, The Macmillan Co., New York, 1939.
[4] M. G. Kendall and B. B. Smith, "Randomness and random sampling numbers," Roy. Stat. Soc. Jour., Vol. 101 (1938), pp. 147-166.
[5] M. G. Kendall and B. B. Smith, "Second paper on random sampling numbers," Supplement to Roy. Stat. Soc. Jour., Vol. 6 (1939), pp. 51-61.
[6] G. U. Yule, "A test of Tippett's random sampling numbers," Roy. Stat. Soc. Jour., Vol. 101 (1938), pp. 167-172.
[7] C. W. Vickery, "On drawing a random sample from a set of punched cards," Supplement to Roy. Stat. Soc. Jour., Vol. 6 (1939), pp. 62-66.



ON A MATCHING PROBLEM ARISING IN GENETICS

By Howard Levene

Columbia University

1. Summary. A statistic useful for detecting deviations from the Hardy-Weinberg equilibrium in population genetics is discussed. Both exact and asymptotic distributions are given and a special case where there is misclassification is discussed. The distribution obtained also arises from a certain card matching problem.


2. Introduction. A system of multiple alleles behaves as follows under Mendelian inheritance. There are $r$ distinct forms or alleles, $a_1, \ldots, a_r$, of a given gene. A given individual contains two genes and can be represented as $a_i/a_j$. If $i = j$ the individual is called a homozygote; if $i \ne j$ it is called a heterozygote. The representation $a_i/a_j$ is called the genotype. In reproduction each gamete produced by an $a_i/a_j$ individual contains one gene which has a probability 1/2 of being $a_i$ and 1/2 of being $a_j$. In fertilization a paternal and a maternal gamete fuse to form a new individual which contains two genes, giving the well-known Mendelian ratios. We now consider a large random breeding population of $N$ individuals. This will contain $2N$ genes, of which the proportion $q_i$ will be of type $a_i$ ($i = 1, \ldots, r$; $\sum q_i = 1$). The probability that a random individual from the next generation will be $a_i/a_j$ is $q_i^2$ ($i = j$) or $2q_iq_j$ ($i \ne j$), which are known as the Hardy-Weinberg equilibrium probabilities. The statistical problem arose in testing (by means of a sample of $n$ individuals) the hypothesis that this Hardy-Weinberg ratio holds against the alternative hypothesis that disturbing forces decrease the number of homozygotes. The actual data has been discussed elsewhere [1].


3. The sample distribution of number of homozygotes. We shall assume throughout this paper that $N$ is so large that random fluctuations in the population proportions from generation to generation can be ignored. Let $x_{ij}$ $(i \le j)$ be the number of $a_i/a_j$ individuals in the sample, and let $y_i = 2x_{ii} + \sum_{j \ne i} x_{ij}$ (writing $x_{ji} = x_{ij}$) be the number of $a_i$ genes in the sample. We have $\sum\sum_{i \le j} x_{ij} = n$ and $\sum y_i = 2n$. Let $h = \sum x_{ii}$ be the number of homozygotes, and $z = n - h$ be the number of heterozygotes in the sample. The probability of the observed sample is

(1)  $\dfrac{n!}{\prod_{i \le j} x_{ij}!} \prod_{i} (q_i^2)^{x_{ii}} \prod_{i<j} (2q_iq_j)^{x_{ij}} = \dfrac{n!\,2^z}{\prod_{i \le j} x_{ij}!} \prod_{i} q_i^{y_i}$.

Since the $q_i$ are unknown we use the conditional probability when $y_1, \ldots, y_r$ are held constant. Whenever we use the word "conditional" hereafter, this condition will be understood. The conditional probability is

(2)  $K \dfrac{n!\,2^z}{\prod_{i \le j} x_{ij}!}$,  where  $\dfrac{1}{K} = {\sum}' \dfrac{n!\,2^z}{\prod_{i \le j} x_{ij}!}$,

where the summation $\sum'$ is over all non-negative integral values of the $x_{ij}$ subject to the condition

$2x_{ii} + \sum_{j \ne i} x_{ij} = y_i$,   $(i = 1, \ldots, r)$.

Consider

(3)  $\left( \sum q_i \right)^{2n} = \left( \sum q_i^2 + 2 \sum\sum_{i<j} q_iq_j \right)^n$

(4)  $= {\sum}^* \dfrac{n!\,2^z}{\prod_{i \le j} x_{ij}!} \prod_{i} q_i^{y_i}$,

where the summation $\sum^*$ is over all non-negative values of the $x_{ij}$ subject to the condition $\sum\sum_{i \le j} x_{ij} = n$. Evidently $1/K$ is the coefficient of $\prod q_i^{y_i}$ in (4); but this must equal the coefficient of this term in the left member of (3), and thus $1/K = (2n)!/\prod y_i!$. Hence the conditional probability of the observed sample is

(5)  $\dfrac{n!\,\prod_{i} y_i!\;2^z}{(2n)!\,\prod_{i \le j} x_{ij}!}$.
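The conditional probability of a sample, $n!\,\prod y_i!\,2^z / [(2n)!\,\prod x_{ij}!]$, can be evaluated exactly with integer arithmetic. The Python sketch below is illustrative; the function name and the way the counts are passed (a dictionary keyed by allele pairs, with alleles numbered from 0) are assumptions made for the example.

```python
from fractions import Fraction
from math import factorial


def conditional_probability(x, r):
    """Conditional probability of the genotype counts x given the allele counts.

    x maps pairs (i, j) with i <= j to the count of a_i/a_j individuals;
    alleles are numbered 0, ..., r - 1.
    """
    n = sum(x.values())
    y = [0] * r                 # allele counts y_i
    z = 0                       # number of heterozygotes
    denominator = factorial(2 * n)
    for (i, j), count in x.items():
        y[i] += count
        y[j] += count           # a homozygote (i == j) contributes two genes
        if i != j:
            z += count
        denominator *= factorial(count)
    numerator = factorial(n) * 2 ** z
    for yi in y:
        numerator *= factorial(yi)
    return Fraction(numerator, denominator)


if __name__ == "__main__":
    # Two alleles, allele counts (2, 2), sample of n = 2 individuals.
    print(conditional_probability({(0, 0): 1, (1, 1): 1}, r=2))
    print(conditional_probability({(0, 1): 2}, r=2))
```

With two alleles and allele counts (2, 2) there are only two possible samples, and their conditional probabilities sum to one, which gives a quick check on the formula.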

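For any function $u(x_{11}, x_{12}, \ldots, x_{rr})$ we will now let $E(u)$ and $\sigma^2(u)$ denote the conditional mean and variance of $u$ for fixed $y_i$, and will refer to these simply as the mean and variance. We first obtain the $s$th factorial moment of $x_{ii}$, that is $E(x_{ii}^{(s)})$, where $x^{(s)} = x(x - 1) \cdots (x - s + 1)$. Consider

(6)  $E(x_{ii}^{(s)}) = \dfrac{n!\,\prod_j y_j!}{(2n)!} {\sum}' \dfrac{2^z}{\prod_{j \le k} x'_{jk}!}$,

where $x'_{jk} = x_{jk}$ except that $x'_{ii} = x_{ii} - s$, and $\sum'$ has the same meaning as in (2). The right member of (6) is evaluated exactly as before, giving

(7)  $E(x_{ii}^{(s)}) = \dfrac{n^{(s)}\,y_i^{(2s)}}{(2n)^{(2s)}}$.

From this expression we obtain

(8)  $E(x_{ii}) = \dfrac{y_i(y_i - 1)}{2(2n - 1)}$

and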


(<1) a\t ) H- _ r ^ 


where $f_i = y_i/2n$ is the sample estimate of $q_i$. Similarly

( 10 ) 


= + 0(1), 




|rf\ II 1 « 

( 11 ) 


^ (2/0'” 




4(2)1 " 1)2 

Other moments can be similarly evaluated; in particular $E(x_{ij}) = y_iy_j/(2n - 1)$.


4. Asymptotic distribution of number of homozygotes. From (8), (9), and (11) we may easily obtain

(12)  $E(h) = \sum E(x_{ii}) = (C - 2n)/(4n - 2)$,

(13)  $\sigma^2(h) = \sum_{i} \sigma^2(x_{ii}) + 2 \sum\sum_{i<j} \sigma(x_{ii}, x_{jj})$,

where $C = \sum y_i^2$ and $D = \sum y_i^3$. The formula (14) is a close approximation to (13) and is easily computed. From (5), by means similar to those classically used to prove asymptotic normality of the binomial distribution, we can prove asymptotic normality of the conditional distribution of $h$; more precisely, if $n \to \infty$ and $y_i/n \to$ constant $(i = 1, \ldots, r)$, then the distribution of $h$ tends to normality with mean given by (12) and variance given by (14).



6, Effect of misclassification Thcic is u fin Lhin complication in the paiticulai 
CUM' lopojtcd in [Ij, All indiviclmils of gonuLypo a,/at aio coirectly claaaihed, 
but all iiidividiml of Ronolype a./n, (i 5^ j) has a known probability p/2 of being 
rlasHdird «i/n, and an c‘(iiiiil piobaliility of bomg claaBilied aja, As a leaultj 
lln' nbsiM vrd piopniliou of boinoAVgotos iB iv biased catimate of the pvopoition in 
llio pupulatioii fiot /ij , yt denote Lho Iriio Htimplo values, and lot h\ tiJ, , y[ 
denole llie nsmded Haniplo valiiea Tlirti k* = h‘ - e, whole e = (?i - h') 
/V(l ^ P)t will Kivo an uidmiHed estiiniiLe, i e. PJih*) = I>'{}i) In Older to uao h* 
we muBllmve its (oonditumiil) variance Since h* np/{l ^ v) 'h W{1 ^ v), 

= [1/(1 ^ 

Lei h — h' « e, then foi lingo fixed (n — /i), e la appioximately normally dis¬ 
tributed wiLli mean {n — A)ji and viuianco 



94 


llOWVlin UVtNL 


$(n - h)p(1 - p) = [n - E(h)]p(1 - p) + [E(h) - h]p(1 - p).$

Neglecting the remainder term in this variance, $\epsilon$ and $h$ have a joint normal distribution with parameters that are easily calculated. We then have

$\sigma_{h'}^2 = \sigma_h^2 + \sigma_\epsilon^2 + 2\,\mathrm{Cov}(h, \epsilon)$, or $\sigma_{h'}^2 = (1 - p)^2 \sigma_h^2 + [n - E(h)]p(1 - p)$,

giving

(16) $\sigma_{h^*}^2 = \sigma_h^2 + [n - E(h)]p/(1 - p).$

In [1] $\sigma_{h^*}^2$ was given as $\sigma_h^2 + e$ for the sake of simplicity. This would tend to be smaller than (16), but only negligibly so. Strictly speaking the calculation of $E(h)$ and $\sigma_h^2$ from (12) and (14) requires a knowledge of the true $y_i$, but the observed $y_i'$ are unbiased estimates of the $y_i$ and their use should cause no serious trouble.
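The unbiasedness of the corrected count can be checked exactly in a small case. The following is only a sketch under stated assumptions: the true number of homozygotes is held fixed, each of the remaining individuals is recorded as a homozygote independently with probability p, and the function names and numerical values are hypothetical choices made for illustration.

```python
from fractions import Fraction
from math import comb


def h_star(h_recorded, n, p):
    """Corrected count h* = h' - (n - h') p / (1 - p)."""
    return h_recorded - (n - h_recorded) * p / (1 - p)


def expected_h_star(h_true, n, p):
    """Exact mean of h* when h' - h is Binomial(n - h, p) (assumed model)."""
    m = n - h_true  # number of true heterozygotes
    total = Fraction(0)
    for k in range(m + 1):  # k heterozygotes recorded as homozygotes
        prob = comb(m, k) * p**k * (1 - p) ** (m - k)
        total += prob * h_star(h_true + k, n, p)
    return total


if __name__ == "__main__":
    # Hypothetical values, chosen only for illustration.
    print(expected_h_star(h_true=7, n=20, p=Fraction(1, 10)))  # prints 7
```

Exact rational arithmetic is used so that the equality of the mean of the corrected count with the true count is not blurred by rounding.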

6. Combinatorial statement of the problem. This problem can also be expressed as one of card matching as follows. A deck contains $2n$ cards of $r$ different suits, with $y_i$ cards of the $i$th suit $(i = 1, \ldots, r)$. We draw $n$ pairs of cards at random without replacement, exhausting the deck. What is the distribution of $h$, the number of twins (pairs in which both members are of the same suit)? The probability of exactly $h$ twins is given by (4), and in the limit $h$ is normally distributed with mean given by (12) and variance given by (14). The card matching problem does not involve the notion of conditional probability. By introducing variables equal to one if the $\alpha$th pair is a twin and zero otherwise, the moments of $h$ can also be obtained without using generating functions.
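The card-matching statement makes formula (12) easy to check by brute force on a very small deck. The sketch below is illustrative only and its function names are hypothetical. It assumes that pairing adjacent cards in a uniformly random ordering of the deck is equivalent to drawing n pairs at random without replacement.

```python
from fractions import Fraction
from itertools import permutations


def expected_twins(y):
    """Formula (12): E(h) = (C - 2n)/(4n - 2), with C the sum of y_i squared.

    y lists the number of cards in each suit; its total 2n must be even.
    """
    two_n = sum(y)
    c = sum(v * v for v in y)
    return Fraction(c - two_n, 2 * two_n - 2)


def expected_twins_by_enumeration(y):
    """Average number of twins over every ordering of the deck.

    Cards in positions (0, 1), (2, 3), ... form the n pairs.
    Only feasible for very small decks.
    """
    deck = [suit for suit, count in enumerate(y) for _ in range(count)]
    total = 0
    orderings = 0
    for order in permutations(deck):
        total += sum(order[i] == order[i + 1] for i in range(0, len(order), 2))
        orderings += 1
    return Fraction(total, orderings)


if __name__ == "__main__":
    print(expected_twins([3, 2, 1]), expected_twins_by_enumeration([3, 2, 1]))
```

Enumerating all orderings treats cards of the same suit as distinguishable, which leaves the average unchanged and keeps the code short.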


REFERENCE

[1] Theodosius Dobzhansky and Howard Levene, "Genetics of natural populations. XVII. Proof of operation of natural selection in wild populations of Drosophila pseudoobscura," Genetics, Vol. 33 (1948), pp. 537-547.



A MULTIPLE DECISION PROCEDURE FOR CERTAIN PROBLEMS IN 

THE ANALYSIS OF VARIANCE 

By Edward Paulson
University of Washington

1. Introduction. In this paper we will discuss a certain type of problem which arises in many applications of the analysis of variance. We suppose that there are given $K$ varieties and that we are required to investigate the differences between them on the basis of the yields from a given experimental design, such as a set of randomized blocks or a Latin square. The classical procedure [1] for dealing with this problem has been to test the null hypothesis that the $K$ varieties are all equal by computing the ratio of the mean sum of squares between varieties to the residual mean sum of squares, and rejecting the null hypothesis whenever this ratio exceeded the critical value corresponding to the level of significance used. However, the standard discussions of this procedure seem to be quite vague on the question of what action should be taken after the null hypothesis has been rejected.

In a number of problems, the practical situation seems to be such that, instead of testing the null hypothesis that the varieties do not differ, what is really required is a statistical rule or "decision function" which on the basis of the observed yields will classify the $K$ varieties into a "superior" group and an "inferior" group. If the superior group consists of more than one variety, the next appropriate action will of course depend on the particular problem at hand. In some situations the varieties in the superior group might then be subject to further selection on the basis of some secondary characteristic, or additional observations might be taken to discriminate between the members of the superior group, after discarding the varieties in the inferior group. However, if all varieties happen to be classified in one group, the group will be labelled "neutral" and this result is to be interpreted as implying that the varieties are homogeneous.

In this formulation, the problem is now of a multiple decision type; it is necessary to decide on the basis of a sample which one out of the $2^K - 1$ possible decisions (or classifications) to select. We will suggest a solution which seems quite reasonable on an intuitive basis, but it is still an open question whether this solution is an optimum one.

2. A special case. In this section we will discuss the problem under the assumption that the variance $\sigma^2$ of a single observation is known a priori. This is a rather restrictive assumption, but it can be considered as approximately satisfied when the number of degrees of freedom available for estimating the variance is large, which will often be the case. The minor modifications necessary to secure exact results for the small sample case when $\sigma$ is unknown are discussed in section 3. We also assume that the experimental design has been so selected that there will be the same number ($r$) of observations on each of the $K$ varieties.

Now let $x_{i\alpha}$ = the $\alpha$th observation on the $i$th variety $(i = 1, 2, \ldots, K;\ \alpha = 1, 2, \ldots, r)$, let $\bar{x}_i = \sum_{\alpha=1}^{r} x_{i\alpha}/r$, put $m_i = E(\bar{x}_i)$ where $E$ stands for expected value, and take $\lambda$ to be a given positive constant. The conventional assumption is made that all the observations are normally and independently distributed with the same variance $\sigma^2$. Denote by $\bar{x}_M$ the maximum of the $K$ mean values $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_K$. The rule for dividing the varieties into superior and inferior groups is the following: the superior group is to consist of all varieties whose corresponding mean values fall in the interval $[\bar{x}_M - \lambda\sigma/\sqrt{r},\ \bar{x}_M]$ and the remaining varieties constitute the inferior group. (As mentioned earlier, if all the varieties fall into one group, this group is labelled 'neutral' and the varieties are considered homogeneous.)
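The classification rule just stated is simple enough to express directly in code. The following is a minimal sketch, assuming the variety means have already been computed; the function name, argument names and return convention are hypothetical and are not part of the paper.

```python
import math


def classify(means, sigma, r, lam):
    """Split varieties into superior and inferior groups by the rule above.

    means : list of the K sample means
    sigma : known standard deviation of a single observation
    r     : number of observations per variety
    lam   : the positive constant lambda
    Returns (superior, inferior, neutral): two lists of 0-based indices and
    a flag that is True when every variety falls in the superior group.
    """
    top = max(means)
    cutoff = top - lam * sigma / math.sqrt(r)
    superior = [i for i, m in enumerate(means) if m >= cutoff]
    inferior = [i for i, m in enumerate(means) if m < cutoff]
    return superior, inferior, not inferior


if __name__ == "__main__":
    # Hypothetical yields: the cutoff is 12.0 - 1.0 * 1.0 / 2.0 = 11.5.
    print(classify([10.0, 12.0, 11.9, 9.0], sigma=1.0, r=4, lam=1.0))
```

Returning the neutral flag separately keeps the two index lists meaningful even when the inferior group is empty.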

This rule completely determines the classification as soon as $\lambda$ is determined. For a given sample size, we might select $\lambda$ by considering the relative importance of different types of incorrect classifications. If $H$ denotes the error of misclassifying the varieties when in fact they are all equal, and $G$ denotes the error of misclassifying the varieties when they actually are unequal, then it is obvious that the greater the value of $\lambda$, the smaller the probability of an error of type $H$, but the greater the probability of an error of type $G$. Therefore for a given value of $r$ it is necessary to adopt some sort of compromise in selecting $\lambda$.

For a given value of $\lambda$ we will now derive explicit formulas for $P(H)$, the probability of not classifying all the varieties in one group when $m_1 = m_2 = \cdots = m_K$, and for $P(G_1)$, the probability that as a result of the experiment there will not be a superior group consisting only of the $K$th variety when $m_1 = m_2 = \cdots = m_{K-1} = m$ and $m_K = m + \Delta$ $(\Delta > 0)$. $G_1$ was selected because it appeared to be the particular kind of type $G$ error most likely to be useful in applications. Also $P(G_1)$ may be regarded as the least upper bound of the probability of misclassifying the varieties when one variety is superior to any of the others by an amount at least equal to $\Delta$. Now if we denote by $W$ the difference between the maximum and minimum values of the set $\{\bar{x}_i\}$ $(i = 1, 2, \ldots, K)$, then it is obvious that

(2.1) $1 - P(H) = P\{W < \lambda\sigma/\sqrt{r}\},$

which is equivalent to the probability that the range of $K$ independent observations from a normal distribution with unit variance is less than $\lambda$, and has been tabulated by Pearson and Hartley [2]. From these tables it is a routine matter to find $P(H)$ corresponding to a given value of $\lambda$, and conversely. To evaluate $P(G_1)$, we have

$1 - P(G_1) = P\{\bar{x}_i < \bar{x}_K - \lambda\sigma/\sqrt{r}$ for each $i$ $(i = 1, 2, \ldots, K - 1)\}.$








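By evaluating the probability of this event for a fixed value of $\bar{x}_K$ and then integrating out with respect to $\bar{x}_K$, it is a simple matter to verify that

(2.2) $1 - P(G_1) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-y^2/2} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y + (\Delta/\sigma)\sqrt{r} - \lambda} e^{-t^2/2}\, dt \right]^{K-1} dy.$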
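Since tables of this integral are not given, a numerical evaluation may be useful. The sketch below approximates the right hand side of (2.2) with the trapezoidal rule; the function names, the truncation of the range of integration to [-8, 8] and the number of steps are assumptions made for illustration.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def prob_correct_selection(k, r, lam, delta_over_sigma, steps=4000, limit=8.0):
    """Approximate 1 - P(G1) from (2.2) by the trapezoidal rule on [-limit, limit]."""
    shift = delta_over_sigma * math.sqrt(r) - lam
    width = 2.0 * limit / steps
    total = 0.0
    for j in range(steps + 1):
        y = -limit + j * width
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * normal_pdf(y) * normal_cdf(y + shift) ** (k - 1)
    return total * width


if __name__ == "__main__":
    # Hypothetical design: K = 4 varieties, r = 9, lambda = 1, Delta/sigma = 1.
    print(prob_correct_selection(k=4, r=9, lam=1.0, delta_over_sigma=1.0))
```

A useful check is the case of equal means with lambda = 0, where the integral reduces to the chance 1/K that a given variety has the largest mean.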


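In some applications, it may be desirable to have an explicit expression for the probability that the superior group will consist of the $K$th variety and not more than $S$ inferior varieties when $m_1 = m_2 = \cdots = m_{K-1} = m$ and $m_K = m + \Delta$. If we denote this probability by $1 - P^*$ it is not difficult to show that

$1 - P^* = \sum_{a=0}^{S} \binom{K-1}{a} [T_{1a} + a T_{2a}]$, where

(2.3) $T_{1a} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-y^2/2} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y + (\Delta/\sigma)\sqrt{r} - \lambda} e^{-t^2/2}\, dt \right]^{K-a-1} \left[ \frac{1}{\sqrt{2\pi}} \int_{y + (\Delta/\sigma)\sqrt{r} - \lambda}^{y + (\Delta/\sigma)\sqrt{r}} e^{-t^2/2}\, dt \right]^{a} dy$, and

$T_{2a} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-y^2/2} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y - \lambda} e^{-t^2/2}\, dt \right]^{K-a-1} \left[ \frac{1}{\sqrt{2\pi}} \int_{y - \lambda}^{y} e^{-t^2/2}\, dt \right]^{a-1} \left[ \frac{1}{\sqrt{2\pi}} \int_{y - (\Delta/\sigma)\sqrt{r} - \lambda}^{y - (\Delta/\sigma)\sqrt{r}} e^{-t^2/2}\, dt \right] dy.$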


3. General case. We now briefly discuss the exact treatment of the problem when $\sigma$ is unknown. The notation of section 2 will be used, but in addition denote by $s^2$ an estimate of $\sigma^2$ resulting from the given experimental design which is based on the residual sum of squares with $n$ degrees of freedom. It is well known that $s^2$ is independent of the set $\{\bar{x}_i\}$ $(i = 1, 2, \ldots, K)$. Now the rule to be used in classifying the varieties into two groups is as follows: the superior group is to consist of all those varieties whose mean values fall in the interval $[\bar{x}_M - \lambda s/\sqrt{r},\ \bar{x}_M]$ and the inferior group consists of the remaining varieties.

We now find that:

(3.1) $1 - P(H) = P\{W < \lambda s/\sqrt{r}\}.$

The right hand side of (3.1) depends only on the distribution of the 'studentized' range and has also been tabulated by Pearson and Hartley [3], although the tabulation is considerably less complete than that of the range in [2]. It is also easy to verify that the expression for $P(G_1)$ now becomes


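(3.2) $P(G_1) = 1 - \frac{1}{\sqrt{2\pi}\, 2^{n/2}\, \Gamma(n/2)} \int_0^{\infty} \int_{-\infty}^{\infty} w^{(n/2)-1} e^{-(w + y^2)/2} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y + (\Delta/\sigma)\sqrt{r} - \lambda\sqrt{w/n}} e^{-t^2/2}\, dt \right]^{K-1} dy\, dw$

with a similar modification for $P^*$.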





4. Remarks. Any application of the ideas suggested here would be greatly facilitated if tables of $P(G_1)$ were made available. If this were done, it would be possible to decide in advance of an experiment how large $r$ should be in order to have a fixed control over both types $H$ and $G_1$ errors. It is obvious that further research both along theoretical and applied lines is needed. In conclusion, the writer would like to thank Professor Albert Bowker for several helpful suggestions.


REFERENCES

[1] R. A. Fisher, Statistical Methods for Research Workers, Chapters 7, 8.

[2] E. S. Pearson and H. O. Hartley, "Tables of the probability integral of the range in samples from a normal population," Biometrika, Vol. 32 (1941-42), pp. 301-310.

[3] E. S. Pearson and H. O. Hartley, "Tables of the probability integral of the studentized range," Biometrika, Vol. 33 (1943), pp. 89-99.



A MODIFIED EXTREME VALUE PROBLEM 
By Benjamin Epstein¹

Coal Research Laboratory, Carnegie Institute of Technology

1. Introduction and summary. Consider the following problem.

Particles are distributed over unit areas in such a way that the number of particles to be found in such areas is a random variable following the law of Poisson, with $\nu$ equal to the expected number of particles per unit area. Furthermore, the particles themselves are assumed to vary in magnitude according to a size distribution specified (independently of the particular unit area chosen) by a d.f. $F(x)$ defined over some interval $a \le x \le b$, with $F(a) = 0$ and $F(b) = 1$. The problem is to find the distribution of the smallest, largest, or more generally the $n$th smallest or $n$th largest particle in randomly chosen unit areas.

The problem as stated is not completely specified. To specify the distribution of smallest or largest particles in a unit area one must give a rule for dealing with those areas which contain no particles at all. More generally, in the case of the distribution of the $n$th smallest or $n$th largest particle, one must give a rule for dealing with those areas which contain $(n - 1)$ or fewer particles. There are at least two possible alternatives. One alternative is to omit none of the areas from consideration by setting up the following rule: if no particles are found in a given unit area then this area will be considered as one for which the smallest size particle is $x = b$ and for which the largest size particle is $x = a$. More generally, if $(n - 1)$ or fewer particles are found in a given unit area then this area will be considered as one for which the $n$th smallest size particle is $x = b$ and for which the $n$th largest size particle is $x = a$. A second alternative is to restrict attention to those areas which contain at least one particle (in the case of the distribution of smallest or largest values) or at least $n$ particles (in the case of the distribution of the $n$th smallest or $n$th largest particle). In other words, this means finding the relevant conditional distribution.

From the point of view of the application of the theory of extreme values to fracture problems, there are some situations where the first model and other situations where the second model is the more appropriate in describing the phenomenon under investigation. In this paper section 2 will be devoted to a derivation of the distributions associated with the first alternative; in section 3 the conditional distributions will be described briefly.

¹ Present address, Department of Mathematics, Wayne University, Detroit, Michigan.

2. The distributions under the first alternative. In this section we shall be concerned with the first alternative. To find the distribution of the $n$th smallest particle in unit areas, we first observe (the verification is left to the reader) that under the hypotheses of section 1, the number of particles having size $\le x$ in a unit area is distributed according to the law of Poisson, with expected number equal to $\nu F(x)$. Next we note that the probability that the $n$th smallest particle in a unit area exceeds $x$ in size is equal to the probability of finding exactly 0, or exactly 1, or exactly 2, $\ldots$, or exactly $(n - 1)$ particles of size $\le x$ in that area. Therefore $G_n(x)$, the probability that the $n$th smallest size particle in a unit area is $\le x$, is given by


(1) $G_n(x) = 1 - \sum_{j=0}^{n-1} e^{-\nu F(x)} \frac{[\nu F(x)]^j}{j!}, \quad x < b;$

$= 1, \quad x \ge b,$

where we have assigned to the size $x = b$ the probability $\sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!$, which is just equal to the probability of finding fewer than $n$ particles in a unit area.
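Equation (1) is a Poisson tail probability and can be evaluated directly. The following sketch is illustrative only: the function names are hypothetical, and the uniform size distribution on [0, 1] is an assumption chosen to make the example concrete.

```python
import math


def poisson_cdf(m, mean):
    """P(N <= m) for N distributed as Poisson(mean); the empty sum gives 0 for m < 0."""
    return sum(math.exp(-mean) * mean**j / math.factorial(j) for j in range(m + 1))


def nth_smallest_cdf(x, n, nu, size_cdf, b):
    """G_n(x) of equation (1); size_cdf is the particle-size d.f. F."""
    if x >= b:
        return 1.0
    return 1.0 - poisson_cdf(n - 1, nu * size_cdf(x))


def uniform_cdf(x):
    """Illustrative F: sizes uniform on [0, 1] (an assumption, not from the paper)."""
    return min(max(x, 0.0), 1.0)


if __name__ == "__main__":
    # With nu = 2 and n = 1, G_1(0.5) = 1 - exp(-1).
    print(nth_smallest_cdf(0.5, n=1, nu=2.0, size_cdf=uniform_cdf, b=1.0))
```

The jump of the function at x = b is the probability of finding fewer than n particles, which the code reproduces by returning 1 only from x = b onward.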

If the d.f. $F(x)$ has a derivative $f(x)$ for all $x$ lying in $a \le x \le b$, then $G_n(x)$ has a derivative for any value of $x \ne b$. Therefore the probability density for the $n$th smallest size particle is, for any $x \ne b$, given by the function $g_n(x)$ where

(2) $g_n(x) = e^{-\nu F(x)} \frac{[\nu F(x)]^{n-1}}{(n-1)!}\, \nu f(x), \quad a \le x < b;$

$= 0, \quad x < a,\ x > b.$

A finite probability $\sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!$ is assigned to $x = b$.

If one makes the transformation $y = \nu F(x)$ (for a similar transformation in extreme value theory see [1, page 371]), then (1) and (2) become

(1') $G_n(y) = 1 - \sum_{j=0}^{n-1} e^{-y} \frac{y^j}{j!}, \quad 0 \le y < \nu;$

$= 1, \quad y \ge \nu,$

and

(2') $g_n(y) = \frac{e^{-y} y^{n-1}}{(n-1)!}, \quad 0 \le y < \nu;$

$= 0, \quad y < 0,\ y > \nu.$

A finite probability $\sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!$ is assigned to $y = \nu$.

The distribution of the smallest size particle in a randomly chosen area is found by letting $n = 1$ in equation 1.

In a similar way one can find the distribution of the $n$th largest particle in a unit area. $H_n(x)$, the probability that the $n$th largest size particle in a unit area is $\le x$, is given by





(3) $H_n(x) = 0, \quad x < a;$

$= \sum_{j=0}^{n-1} e^{-\nu(1 - F(x))} \frac{[\nu(1 - F(x))]^j}{j!}, \quad x \ge a,$

where we have assigned to the size $x = a$ the probability $\sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!$.

If, as before, $F(x)$ is assumed to have a derivative $f(x)$ for all $x$ lying in $a \le x \le b$, then the probability density for the $n$th largest size particle is, for any $x \ne a$, given by the function $h_n(x)$ where

(4) $h_n(x) = e^{-\nu(1 - F(x))} \frac{[\nu(1 - F(x))]^{n-1}}{(n-1)!}\, \nu f(x), \quad a < x \le b;$

$= 0, \quad x < a,\ x > b.$

A finite probability $\sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!$ is assigned to $x = a$.

If one makes the transformation $z = \nu[1 - F(x)]$, then (3) and (4) become

(3') $H_n(z) = 1 - \sum_{j=0}^{n-1} e^{-z} \frac{z^j}{j!}, \quad z < \nu;$

$= 1, \quad z \ge \nu,$

and

(4') $h_n(z) = \frac{e^{-z} z^{n-1}}{(n-1)!}, \quad 0 \le z < \nu;$

$= 0, \quad z < 0,\ z > \nu,$

with a finite probability $\sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!$ assigned to $z = \nu$.

The distribution of the largest size particle in a randomly chosen unit area is found by letting $n = 1$ in equation 3.


3. Conditional distributions of the extreme values. The appropriate conditional distributions for the problem under consideration can be written down readily. The step function component which occurred in section 2 is no longer present since we restrict our attention only to those areas which contain at least $n$ particles (in the general case of the distribution of $n$th smallest or $n$th largest size particles).

$G_n^*(x)$, the d.f. of the $n$th smallest particle in a unit area chosen at random from the class of areas containing at least $n$ particles, is given by

(5) $G_n^*(x) = 0, \quad x < a;$

$= \frac{1 - \sum_{j=0}^{n-1} e^{-\nu F(x)} [\nu F(x)]^j/j!}{1 - \sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!}, \quad a \le x \le b;$

$= 1, \quad x > b.$





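Similarly $H_n^*(x)$, the d.f. of the $n$th largest particle in a unit area chosen at random from the class of areas containing at least $n$ particles, is given by

(6) $H_n^*(x) = 0, \quad x < a;$

$= \frac{\sum_{j=0}^{n-1} e^{-\nu(1 - F(x))} [\nu(1 - F(x))]^j/j! - \sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!}{1 - \sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!}, \quad a \le x \le b;$

$= 1, \quad x > b.$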
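The conditional distribution functions (5) and (6) differ from the unconditional ones only by removing and renormalizing the probability of fewer than n particles. The sketch below evaluates both as functions of F(x); the function names are hypothetical, and passing the value of F(x) rather than x itself is a simplification made for illustration.

```python
import math


def poisson_cdf(m, mean):
    """P(N <= m) for N distributed as Poisson(mean)."""
    return sum(math.exp(-mean) * mean**j / math.factorial(j) for j in range(m + 1))


def cond_nth_smallest_cdf(fx, n, nu):
    """Equation (5), written in terms of fx = F(x) with 0 <= fx <= 1."""
    return (1.0 - poisson_cdf(n - 1, nu * fx)) / (1.0 - poisson_cdf(n - 1, nu))


def cond_nth_largest_cdf(fx, n, nu):
    """Equation (6), written in terms of fx = F(x) with 0 <= fx <= 1."""
    tail = poisson_cdf(n - 1, nu)  # probability of fewer than n particles
    return (poisson_cdf(n - 1, nu * (1.0 - fx)) - tail) / (1.0 - tail)


if __name__ == "__main__":
    # Hypothetical values: nu = 3 particles expected per unit area, n = 2.
    for fx in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(fx, cond_nth_smallest_cdf(fx, 2, 3.0), cond_nth_largest_cdf(fx, 2, 3.0))
```

Both functions run from 0 at F(x) = 0 to 1 at F(x) = 1 with no jump, in agreement with the remark that the step function component is no longer present.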


4. General remarks and an application. It is interesting to note that the assumptions of section 1 lead to distribution functions in section 2 which are precisely the same as the asymptotic distributions of smallest, largest, or $n$th smallest, or $n$th largest values in samples of fixed size $N$ $(N \to \infty)$ (see e.g. [1, p. 371]). In the problem treated in this paper, $\nu$, the expected number of particles in a unit area, plays the role of $N$ in the fixed sample size case, with the important difference that the distributions in the present paper are exact and not merely asymptotic.


The results of this paper have a direct bearing on certain aspects of fracture problems [2] and in particular on the dielectric breakdown of capacitors [3]. In the latter problem there appears to be ample justification for assuming that the breakdown voltage is influenced to a considerable degree by the presence of flaws known in the technical literature as conducting particles. These particles are spread individually and collectively at random throughout the area of the capacitor and, depending on their size, create a local weakening of the capacitor by reducing the nominal insulation thickness in the neighborhood of flaws. The voltage required to break down the capacitor is equal to that required to break it down at that spot where the greatest penetration has taken place.

In the dielectric problem the statistical distribution of largest values appropriate to the problem is given by (3) with $n = 1$, and the size distribution of conducting particles follows a law of the form $f(x) = \ldots$, $x > 0$. This is a situation where all the capacitors under test are part of the sample (since all must be tested to destruction) and those which happen to contain no defects (an event with probability $e^{-\nu}$) act as if the largest particle size is equal to $a = 0$. $e^{-\nu}$ simply represents the expected fraction of capacitors which have strength equal to the theoretical strength of the insulation.

The conditional distributions of section 3 would be more appropriate in the following sort of practical situation. Suppose that surface flaws spread at random on glass rods are known to reduce greatly the strength of the rods, and that we eliminate by inspection those specimens which have no flaws. Then the strength distribution of the remaining specimens is a conditional distribution since each specimen must contain at least one flaw to be eligible as a member of the sample.





REFERENCES

[1] H. Cramér, Mathematical Methods of Statistics, Princeton University Press, 1946.

[2] B. Epstein, "Statistical aspects of fracture problems," J. Applied Physics, Vol. 19 (1948), pp. 140-147.

[3] B. Epstein and H. Brooks, "The theory of extreme values and its implications in the study of the dielectric strength of paper capacitors," J. Applied Physics, Vol. 19 (1948), pp. 544-550.



ON DISTINCT HYPOTHESES 


By Agnes Berger and Abraham Wald
Columbia University 


1. Introduction. The following problem was suggested to one of the authors by Professor Neyman:

Let $X = (X_1, X_2, \ldots, X_n)$ be a chance vector and let $h$ denote any simple hypothesis specifying its distribution. Let $H_i$ be the composite hypothesis that some element $h$ of a set of simple hypotheses $\{h\}_i$ $(i = 0, 1)$ is true, and assume that $H_0$ and $H_1$ are known to be exhaustive. Let $h_i$ denote an element of $\{h\}_i$ $(i = 0, 1)$.

For any region $W$ of the sample space $S$, let $P(W \mid h)$ be the probability that the sample point falls in $W$ when $h$ is true.

We shall call $H_0$ and $H_1$ distinct, if a region $W$ exists for which

$P(W \mid h_0) \ne P(W \mid h_1)$, for all $h_0 \in \{h\}_0$ and all $h_1 \in \{h\}_1$.

The problem is to establish necessary and sufficient conditions for two composite hypotheses $H_0$ and $H_1$ to be distinct.

For any critical region $W$ for testing $H_0$ against $H_1$, let $\gamma(W \mid h)$ be the probability of a wrong decision when $h$ is true, i.e.

$\gamma(W \mid h) = P(W \mid h)$ for $h \in H_0$,

$\gamma(W \mid h) = 1 - P(W \mid h)$ for $h \in H_1$.

Suppose now that $H_0$ and $H_1$ are not distinct. Then to any $W$ a pair $h_0$, $h_1$ exist such that

$P(W \mid h_0) = P(W \mid h_1),$

thus

$\gamma(W \mid h_0) = 1 - \gamma(W \mid h_1),$

and therefore

(1.1) $\mathrm{l.u.b.}_h\ \gamma(W \mid h) \ge 1/2$ for any $W$.

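We shall prove the following lemma:

LEMMA 2.1. Let $H_i$ be the simple hypothesis that $p_i(x)$ is the density function of $X$ $(i = 0, 1)$, and let $H_0 \ne H_1$, i.e. let the set of points $x$ satisfying $p_0(x) \ne p_1(x)$ have non-zero measure. Then there exists a region $W$ such that $\gamma(W \mid p_i) < 1/2$, $i = 0, 1$.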

PROOF: Let $R_0$ be defined by $p_0 > p_1$, $R_1$ by $p_0 < p_1$, and $R_2$ by $p_0 = p_1$. Since $\int p_i(x)\, dx = 1$ and $p_i \ge 0$ $(i = 0, 1)$, $R_0$ and $R_1$ are of positive measure. Let

$\phi(x) = p_0$ in $R_0$,

$\phi(x) = p_1$ in $R_1$,

$\phi(x) = p_0 = p_1$ in $R_2$.

Then $\int_S \phi(x)\, dx > 1$ and either

a) $\int_{R_1 + R_2} p_1\, dx > 1/2$ or b) $\int_{R_0} p_0\, dx > 1/2$

or both. Assume first a).

Let $R_3 \subset R_1 + R_2$ and such that $\int_{R_3} p_1\, dx = 1/2$, but $\int_{R_3} p_0\, dx < 1/2$. This can be done by including into $R_3$ a part of $R_1$ of non-zero measure. Let $R_4 \subset R_1 + R_2 - R_3$ and such that $0 < \int_{R_4} p_1\, dx < 1/2 - \int_{R_3} p_0\, dx$. Then

$\int_{R_4} p_0\, dx \le \int_{R_4} p_1\, dx < 1/2 - \int_{R_3} p_0\, dx$, thus $\int_{R_3 + R_4} p_0\, dx < 1/2$ but $\int_{R_3 + R_4} p_1\, dx > 1/2$.

Assume now b).

Let $R_5 \subset R_0$ and such that $\int_{R_5} p_0\, dx = 1/2$. Then $\int_{R_5} p_1\, dx < 1/2$. Let $R_6 \subset R_0 - R_5$ and such that $0 < \int_{R_6} p_0\, dx < 1/2 - \int_{R_5} p_1\, dx$. Then

$\int_{R_5 + R_6} p_0\, dx > 1/2$ and $\int_{R_5 + R_6} p_1\, dx < 1/2$.

Thus in case a) $W = R_3 + R_4$, and in case b) $W = S - R_5 - R_6$ is a critical region for which $\gamma(W \mid p_i) < 1/2$ $(i = 0, 1)$. This proves the lemma.
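A concrete instance may help fix ideas. The sketch below is not part of the proof: it assumes two particular densities (normal with unit variance and means 0 and 1) and a region of the form W = {x > c}, and simply evaluates the two error probabilities. The function names and the choice of densities are hypothetical.

```python
import math


def normal_cdf(x, mean=0.0):
    """Distribution function of a normal law with the given mean and unit variance."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0)))


def error_probabilities(c, mean0=0.0, mean1=1.0):
    """gamma(W | p_0) and gamma(W | p_1) for the region W = {x > c}."""
    gamma0 = 1.0 - normal_cdf(c, mean0)  # P(W | p_0): wrongly rejecting H_0
    gamma1 = normal_cdf(c, mean1)        # 1 - P(W | p_1): wrongly accepting H_0
    return gamma0, gamma1


if __name__ == "__main__":
    # The midpoint c = 0.5 makes both error probabilities equal and below 1/2.
    print(error_probabilities(0.5))
```

Any cutoff strictly between the two means would serve in this example; the midpoint is chosen only because it equalizes the two errors.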


3. The main theorem. Assume now $X$ to have a density function $p(x \mid \theta)$ where $\theta = (\theta_1, \theta_2, \ldots, \theta_k)$ is an unknown parameter point. Let $\omega_0$ and $\omega_1$ be two disjoint, bounded and closed subsets of the $k$-dimensional $\theta$-space.

Let $\Omega = \omega_0 + \omega_1$ and suppose that $\theta$ is known to belong to $\Omega$, which therefore will be called the parameter space. Let $H_i$ be the hypothesis that the true parameter point is an element of $\omega_i$ $(i = 0, 1)$.

We shall consider the problem of testing $H_0$ against $H_1$. Clearly, $P(W \mid h)$ can now be written as $P(W \mid \theta)$ and $\gamma(W \mid h)$ as $\gamma(W \mid \theta)$.

We shall make the following assumptions concerning $p(x \mid \theta)$:





Assumption 1. $p(x \mid \theta)$ is continuous in $\theta$. This is of course always fulfilled if $\Omega$ consists only of a finite number of points.

Assumption 2. For any bounded domain $M$ of the sample space we have

$\int_M [\max_\theta p(x \mid \theta)]\, dx < \infty.$

It follows from Assumptions 1 and 2 that

(3.1) $\lim_{r \to \infty} \int_{S - S_r} p(x \mid \theta)\, dx = 0$

uniformly in $\theta$, where $S_r$ is the sphere in the sample space with center at the origin and radius $r$.

In what follows, whenever we shall speak of a cumulative distribution function $g(\theta)$ in the $k$-dimensional parameter space, we shall always mean a cumulative distribution function satisfying the condition

$\int_\Omega dg(\theta) = 1.$

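For any c.d.f. $g(\theta)$ let $W_g$ denote a critical region which contains any sample point $x$ satisfying the inequality

$\int_{\omega_1} p(x \mid \theta)\, dg(\theta) > \int_{\omega_0} p(x \mid \theta)\, dg(\theta),$

and does not contain a sample point $x$ for which

$\int_{\omega_1} p(x \mid \theta)\, dg(\theta) < \int_{\omega_0} p(x \mid \theta)\, dg(\theta).$

It can easily be verified that $W_g$ minimizes the average risk

(3.2) $\int_\Omega \gamma(W \mid \theta)\, dg(\theta)$, i.e., $\int_\Omega \gamma(W_g \mid \theta)\, dg(\theta) = \min_W \int_\Omega \gamma(W \mid \theta)\, dg(\theta).$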
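The minimizing property of the region defined above can be checked by exhaustive search in a finite setting. The following sketch is an illustration under strong simplifying assumptions that are not in the paper: the sample space has finitely many points, the parameter space has finitely many points, and densities are replaced by probability mass functions. All names and the numerical table are hypothetical.

```python
from itertools import combinations


def average_risk(region, pmf, prior, label):
    """Sum over theta of prior(theta) * gamma(W | theta) for a finite problem."""
    risk = 0.0
    for p_theta, g_theta, which in zip(pmf, prior, label):
        prob_w = sum(p_theta[x] for x in region)
        risk += g_theta * (prob_w if which == 0 else 1.0 - prob_w)
    return risk


def bayes_region(pmf, prior, label):
    """Points x where the omega_1 mixture exceeds the omega_0 mixture."""
    region = set()
    for x in range(len(pmf[0])):
        mix0 = sum(g * p[x] for p, g, w in zip(pmf, prior, label) if w == 0)
        mix1 = sum(g * p[x] for p, g, w in zip(pmf, prior, label) if w == 1)
        if mix1 > mix0:
            region.add(x)
    return region


def best_risk_by_search(pmf, prior, label):
    """Minimum average risk over all subsets of the sample space."""
    points = range(len(pmf[0]))
    return min(
        average_risk(set(subset), pmf, prior, label)
        for size in range(len(pmf[0]) + 1)
        for subset in combinations(points, size)
    )


if __name__ == "__main__":
    # Hypothetical table: rows are parameter points, columns are sample points.
    pmf = [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2], [0.1, 0.3, 0.6]]
    prior = [0.3, 0.3, 0.4]
    label = [0, 0, 1]  # the first two parameter points form omega_0
    region = bayes_region(pmf, prior, label)
    print(region, average_risk(region, pmf, prior, label), best_risk_by_search(pmf, prior, label))
```

The search over all subsets is exponential in the number of sample points and is meant only as a check on small examples.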

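Let $\Omega_i$ $(i = 0, 1)$ be the class of all density functions $p(x) = \int_{\omega_i} p(x \mid \theta)\, dh_i(\theta)$ where $h_i(\theta)$ is subject to the condition

$\int_{\omega_i} dh_i(\theta) = 1.$

THEOREM 3.1. Let Assumptions 1 and 2 hold. Then a necessary and sufficient condition for the existence of a critical region $W$ for which $\max_\theta \gamma(W \mid \theta) < 1/2$ is that $\Omega_0$ and $\Omega_1$ be disjoint.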


PROOF. Suppose that $\Omega_0$ and $\Omega_1$ are not disjoint. Then there exist two distribution functions $g_0(\theta)$ and $g_1(\theta)$ such that

$\int_{\omega_0} dg_0(\theta) = \int_{\omega_1} dg_1(\theta) = 1$

and

$\int_{\omega_0} p(x \mid \theta)\, dg_0(\theta) = \int_{\omega_1} p(x \mid \theta)\, dg_1(\theta)$

(except perhaps for points $x$ in a set of measure 0).

Let $g(\theta) = \frac{1}{2} g_0(\theta) + \frac{1}{2} g_1(\theta)$. Clearly, $\max_\theta \gamma(W \mid \theta) \ge \int_\Omega \gamma(W \mid \theta)\, dg(\theta) = \frac{1}{2}$ for any $W$. This proves the necessity of our condition.

We shall now assume that $\Omega_0$ and $\Omega_1$ are disjoint. First we shall show that the results of [1] can be applied. On pages 297-8 of [1] there are seven conditions listed for the sequential case. For the non-sequential case (the one considered here) the conditions 6 and 7 drop out and the first five conditions can be reduced to the following conditions:

Condition 1: The weight function $W(\theta, d)$ is bounded.

Condition 2: For any $\theta$, the chance vector $X$ admits a density function $p(x \mid \theta)$.

Condition 3: For any sequence $\{\theta_i\}$ $(i = 1, 2, \ldots,$ ad inf.$)$ there exists a subsequence $\{\theta_{i_j}\}$ $(j = 1, 2, \ldots,$ ad inf.$)$ and a parameter point $\theta_0$ such that

$\lim_{j \to \infty} p(x \mid \theta_{i_j}) = p(x \mid \theta_0).$

Condition 4: If $\{\theta_i\}$ $(i = 1, 2, \ldots)$ is a sequence of points and $\theta_0$ a point such that

$\lim_{i \to \infty} p(x \mid \theta_i) = p(x \mid \theta_0)$

then

$\lim_{i \to \infty} W(\theta_i, d) = W(\theta_0, d)$

uniformly in $d$.

Condition 5: The same as our Assumption 2.

In our problem $d$ (the decision of the statistician) can take only two values: acceptance or rejection of $H_0$. Condition 1 is evidently fulfilled, since $W(\theta, d) = 0$ if a correct and 1 if a wrong decision is made. Clearly, Conditions 2-5 are also fulfilled in our problem.

A distribution $g(\theta)$ is said to be least favorable, if it maximizes the minimum average risk, i.e., if it maximizes $\int_\Omega \gamma(W_g \mid \theta)\, dg(\theta)$ with respect to $g$.

It follows from Theorems 4.1 and 4.4 of [1] that there exists a least favorable distribution.

Let $g^*(\theta)$ be a least favorable distribution. Then, as has been shown in [1], there exists a $W_{g^*}$ such that


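(3.3) $\max_\theta \gamma(W_{g^*} \mid \theta) = \int_\Omega \gamma(W_{g^*} \mid \theta)\, dg^*(\theta).$

Thus, our theorem is proved if we can show that

(3.4) $\int_\Omega \gamma(W_{g^*} \mid \theta)\, dg^*(\theta) < \frac{1}{2}.$

Let $H_0^*$ be the hypothesis that the true density is given by

$p_0(x) = \frac{\int_{\omega_0} p(x \mid \theta)\, dg^*(\theta)}{\int_{\omega_0} dg^*(\theta)},$

and $H_1^*$ the hypothesis that the true density is given by

$p_1(x) = \frac{\int_{\omega_1} p(x \mid \theta)\, dg^*(\theta)}{\int_{\omega_1} dg^*(\theta)}.$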

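Since $\Omega_0$ and $\Omega_1$ are disjoint, $p_0(x)$ and $p_1(x)$ are different density functions. Hence, according to Lemma 2.1, there exists a critical region $W^*$ for testing $H_0^*$ such that $\alpha^* < \frac{1}{2}$ and $\beta^* < \frac{1}{2}$, where $\alpha^*$ is the probability of type I error, and $\beta^*$ is the probability of type II error. Clearly,

(3.5) $\frac{1}{2} > \alpha^* \int_{\omega_0} dg^*(\theta) + \beta^* \int_{\omega_1} dg^*(\theta) = \int_\Omega \gamma(W^* \mid \theta)\, dg^*(\theta) \ge \int_\Omega \gamma(W_{g^*} \mid \theta)\, dg^*(\theta).$

Hence, our theorem is proved.

It follows from (1.1) that if $H_0$ and $H_1$ are not distinct, $\Omega_0$ and $\Omega_1$ are not disjoint.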

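On the other hand, suppose that $\Omega_0$ and $\Omega_1$ are not disjoint and let $g_0(\theta)$ and $g_1(\theta)$ be defined as before. Then for every $W$

$\int_{\omega_0} P(W \mid \theta)\, dg_0(\theta) = \int_{\omega_1} P(W \mid \theta)\, dg_1(\theta).$

If $\omega_0$ and $\omega_1$ are connected, there exist parameter points $\theta_0(W)$ in $\omega_0$ and $\theta_1(W)$ in $\omega_1$ such that

$P(W \mid \theta_0(W)) = \int_{\omega_0} P(W \mid \theta)\, dg_0(\theta)$

and

$P(W \mid \theta_1(W)) = \int_{\omega_1} P(W \mid \theta)\, dg_1(\theta)$

for every $W$. Hence

$P(W \mid \theta_0(W)) = P(W \mid \theta_1(W))$

for every $W$.

Thus, we obtain the following theorem:

THEOREM 3.2. If $\omega_0$ and $\omega_1$ are connected, then under the assumptions of Theorem 3.1 a necessary and sufficient condition for $H_0$ and $H_1$ to be distinct is that $\Omega_0$ and $\Omega_1$ be disjoint.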

REFERENCE

[1] A. Wald, "Foundations of a general theory of sequential decision functions," Econometrica, Vol. 15 (1947), pp. 279-313.



AN APPROXIMATION TO THE SAMPLING VARIANCE OF AN ESTIMATED
MAXIMUM VALUE OF GIVEN FREQUENCY BASED ON FIT
OF DOUBLY EXPONENTIAL DISTRIBUTION OF MAXIMUM VALUES*

By Bradford F. Kimball
N. Y. State Department of Public Service

1. Introduction. Given the doubly exponential distribution of maximum values

(1) $F(x) = \exp(-e^{-y}), \quad y = \alpha(x - u),$

where $\alpha$ and $u$ are unknown parameters, with a prescribed frequency $F_0$ the "reduced variate" $y$ is fixed, say at $y = y_0$. Thus with

$F_0 = .99, \quad y_0 = 4.60015 \ldots.$

Given a sample of $n$ maximum values $x_i$, we are interested in the sampling variance of

(2) $\hat{x} = g(\hat{u}, \hat{\alpha}) = \hat{u} + y_0/\hat{\alpha}$

due to sampling variations of the estimates $\hat{\alpha}$ and $\hat{u}$.
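The reduced variate follows from inverting (1), and the estimate (2) is then a one-line computation. The sketch below is illustrative only; the function names and the sample parameter estimates are hypothetical.

```python
import math


def reduced_variate(f0):
    """Solve F0 = exp(-exp(-y)) for y, i.e. y0 = -log(-log F0)."""
    return -math.log(-math.log(f0))


def estimated_value(u_hat, alpha_hat, f0):
    """Equation (2): the estimate u + y0/alpha of the value with frequency F0."""
    return u_hat + reduced_variate(f0) / alpha_hat


if __name__ == "__main__":
    print(round(reduced_variate(0.99), 5))  # prints 4.60015
    # Hypothetical estimates of u and alpha, for illustration only.
    print(estimated_value(u_hat=100.0, alpha_hat=0.02, f0=0.99))
```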

H. Fairfield Smith has recently pointed out to me that the examples of applications of sufficient statistical estimation functions to this problem given in a previous paper (see [1, pp. 307-309]) give too large a range for $\hat{x} = g(\hat{u}, \hat{\alpha})$ because the sample points $(\hat{u}, \hat{\alpha})$ within the confidence region of the constant probability ellipse apply to optimum estimates of $(\hat{u}, \hat{\alpha})$ rather than to that of $g = g(\hat{u}, \hat{\alpha})$. What the problem calls for is the determination of the exact form of curves $g_1(\hat{u}, \hat{\alpha})$ and $g_2(\hat{u}, \hat{\alpha})$ such that the integral of the pdf of the estimation functions over all sample values $(\hat{u}, \hat{\alpha})$ which lie between these two curves is equal to the confidence level (taken as .95 in previous paper). Further considerations of this being the shortest interval $g_2 - g_1$ also come into play.

As so often happens in research, the previous analysis, although not giving the final answer, suggests the next step. If we change our parameters to

(3) $g = g(u, \alpha) = u + y_0/\alpha, \quad \alpha' = \alpha$

and are able to carry through the inverse of the maximum likelihood solution for fitting of (1) to $n$ sample values $x_i$, then we shall be in a position to find the asymptotic marginal distribution of $\sqrt{n}(\hat{g} - g)$, which will give the answer to our problem (see [2]).

The Jacobian of this transformation of parameters is

$\partial(u, \alpha)/\partial(g, \alpha') = \begin{vmatrix} 1 & y_0/\alpha'^2 \\ 0 & 1 \end{vmatrix} = 1$

and hence for $\alpha' > 0$ no new singularities are introduced.

* This involves a correction of a previous paper [1].



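2. The equations of the maximum likelihood solution. For a sample of size $n$, the pdf of the sampling distribution in terms of the old parameters is given by

$P[x_i; u, \alpha] = \alpha^n \exp[-\sum e^{-\alpha(x_i - u)}] \exp[-\sum \alpha(x_i - u)],$

and

$\log P = n \log \alpha - e^{\alpha u} \sum e^{-\alpha x_i} - \alpha \sum x_i + n\alpha u$

$= n[\log \alpha - e^{\alpha u}(\sum e^{-\alpha x_i}/n) - \alpha\bar{x} + \alpha u].$

Now change to the new parameters and use the substitutions:

$z_i = e^{-\alpha' x_i}, \quad \bar{z} = (\sum z_i)/n, \quad z_0 = e^{-\alpha' u} = e^{-\alpha' g + y_0}.$

Thus

$\partial z_0/\partial g = -\alpha' z_0, \quad \partial z_0/\partial \alpha' = -g z_0,$

and denoting $\log P$ by $L$ we write

$L = n[\log \alpha' - \bar{z}/z_0 - \alpha'\bar{x} + \alpha' g - y_0].$

Hence

(4) $L_g = -n\alpha'[\bar{z}/z_0 - 1],$

(5) $L_{\alpha'} = n[1/\alpha' - \partial(\bar{z}/z_0)/\partial\alpha' - \bar{x} + g].$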

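3. Derivation of expected values needed. Recall that

$\bar{z}/z_0 = \sum e^{-\alpha'(x_i - g) - y_0}/n = \sum e^{-\alpha(x_i - u)}/n.$

Hence

(6) $\partial(\bar{z}/z_0)/\partial\alpha' = -\sum (x_i - g) e^{-\alpha'(x_i - g) - y_0}/n,$

$\partial(\bar{z}/z_0)/\partial\alpha = -\sum (x_i - u) e^{-\alpha(x_i - u)}/n,$

(7) $\partial^2(\bar{z}/z_0)/\partial\alpha'^2 = \sum (x_i - g)^2 e^{-\alpha'(x_i - g) - y_0}/n,$

$\partial^2(\bar{z}/z_0)/\partial\alpha^2 = \sum (x_i - u)^2 e^{-\alpha(x_i - u)}/n.$

By investigation of the generating function

$G(t) = E[\sum (z_i/z_0)^t/n], \quad z_i = e^{-\alpha x_i},$

it can be shown that

$E[\bar{z}/z_0] = 1,$

$E\{\sum (x_i - u) e^{-\alpha(x_i - u)}/n\} = -(1/\alpha)\Gamma'(2) = -(1/\alpha)(1 - C),$

where $C$ denotes Euler's constant, $.57721 \ldots$, and

$E\{\sum (x_i - u)^2 e^{-\alpha(x_i - u)}/n\} = (1/\alpha^2)\Gamma''(2) = (1/\alpha^2)(\pi^2/6 + C^2 - 2C).$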


112 


BEADFOHD F. KIMBALL 


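Hence to find expected values of (6) and (7) we note that

$-\sum (x_i - g) e^{-\alpha'(x_i - g) - y_0}/n = -\sum (x_i - u - y_0/\alpha) e^{-\alpha(x_i - u)}/n$

$= -\sum (x_i - u) e^{-\alpha(x_i - u)}/n + (y_0/\alpha)\, \bar{z}/z_0,$

and therefore

(8) $E[\partial(\bar{z}/z_0)/\partial\alpha'] = E[\partial(\bar{z}/z_0)/\partial\alpha] + (y_0/\alpha) E(\bar{z}/z_0).$

Similar analysis shows that

(9) $E[\partial^2(\bar{z}/z_0)/\partial\alpha'^2] = E[\partial^2(\bar{z}/z_0)/\partial\alpha^2] + (2y_0/\alpha) E[\partial(\bar{z}/z_0)/\partial\alpha] + (y_0^2/\alpha^2) E(\bar{z}/z_0).$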


4. The inverse of the maximum likelihood solution. It will first be noted
that the maximum likelihood equations (4) and (5) for determining best estimates
of g and $\alpha'$ become identical to those for determining best estimates of old
parameters u and $\alpha$, when the transformation of parameters (3) is applied to
them. This is easily verified by applying relations developed above.

This means that the best estimates $\hat{g}$ and $\hat{\alpha}'$ obtained from (4) and (5) are related
to the best estimates of old parameters $\hat{u}$ and $\hat{\alpha}$ by

$\hat{g} = \hat{u} + y_0/\hat{\alpha}, \quad \hat{\alpha}' = \hat{\alpha}.$

We now proceed to set up the inverse of the maximum likelihood solution.
In order to do this we first need the variance-covariance matrix of the direct
solution. This is (see [2])



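$\begin{Vmatrix} E[-L_{gg}] & E[-L_{g\alpha'}] \\ E[-L_{g\alpha'}] & E[-L_{\alpha'\alpha'}] \end{Vmatrix}$

Now

$L_{gg} = -n\alpha'^2(\bar{z}/z_0), \quad E[-L_{gg}] = n\alpha'^2,$

$L_{g\alpha'} = -n[\bar{z}/z_0 - 1 + \alpha'\partial(\bar{z}/z_0)/\partial\alpha'], \quad E[-L_{g\alpha'}] = n(1 - C + y_0),$

$L_{\alpha'\alpha'} = -n[1/\alpha'^2 + \partial^2(\bar{z}/z_0)/\partial\alpha'^2],$

$E[-L_{\alpha'\alpha'}] = (n/\alpha'^2)[\pi^2/6 + (1 - C + y_0)^2].$

Thus the variance-covariance matrix of the estimation functions (4) and (5) is

$\begin{Vmatrix} n\alpha'^2 & n(1 - C + y_0) \\ n(1 - C + y_0) & (n/\alpha'^2)[\pi^2/6 + (1 - C + y_0)^2] \end{Vmatrix}.$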

The asymptotic form of the inverse solution for $\sqrt{n}(\hat{g} - g)$ and $\sqrt{n}(\hat{\alpha}' - \alpha')$ is

if'”" 

read -h^z^y^. +&(zA,)/5at in second equation of (6.2) ahonld 



$\begin{Vmatrix} [1 + (1 - C + y_0)^2/(\pi^2/6)]/\alpha'^2 & -(1 - C + y_0)/(\pi^2/6) \\ -(1 - C + y_0)/(\pi^2/6) & \alpha'^2/(\pi^2/6) \end{Vmatrix}$

This gives the solution sought. From the general theory of the maximum
likelihood solution (see [2]) the distribution of $[\sqrt{n}(\hat{g} - g), \sqrt{n}(\hat{\alpha}' - \alpha')]$ is
asymptotically normal. Hence the marginal distribution of $\sqrt{n}(\hat{g} - g)$ will be
asymptotically normal, and for finite n, the standard deviation may be approximated
by

(12) $\sigma(\hat{g} - g) = [1/(\sqrt{n}\,\alpha')]\sqrt{1 + (1 - C + y_0)^2/(\pi^2/6)}.$

Now the correlation coefficient for the asymptotic bivariate normal distribution
is seen to be

$r = -(1 - C + y_0)/\sqrt{\pi^2/6 + (1 - C + y_0)^2}.$

If $\alpha'$ were known, we should have the standard deviation of $\sqrt{n}(\hat{g} - g)$ reduced
by factor $\sqrt{1 - r^2}$. This is found to be equal to the reciprocal of the second
factor in the equation (12). Hence we conclude that if $\alpha'$ be known, the standard
deviation of $(\hat{g} - g)$, for finite n, is given approximately by

(13) $\sigma(\hat{g} - g) = 1/(\sqrt{n}\,\alpha').$


5. An example. Using the same example outlined in the previous paper (see [1]),
we have n = 57, $\hat{\alpha}$ = .01921, 1 - C = .422784, $y_0$ = 4.60015.
This gives $\sigma$ = 27.826. For a 95% confidence interval we take (1.96)$\sigma$ = 54.54,
and with $\hat{u}$ = 180.0,

$\hat{g} = \hat{u} + y_0/\hat{\alpha} = 419.7,$

and the interval is approximated by

$|\hat{g} - g| < 54.5,$

which as an approximation gives the symmetrical interval

365.2 < g < 474.2.

Method 4 used in the previous paper gave the longer interval (see Introduction)
which was not symmetrical about $\hat{g}$:

302.8 < g < 507.4.
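
The approximations (12) and (13) are simple enough to evaluate directly. The following is a minimal Python sketch, not part of the original note; the function names are illustrative, and the inputs are the rounded figures quoted in the example, so small differences from the printed values can arise from rounding.

```python
import math

EULER_C = 0.5772156649015329  # Euler's constant C

def sigma_g(n, alpha, y0):
    """Approximate standard deviation of g-hat, equation (12)."""
    k = 1.0 - EULER_C + y0
    second_factor = math.sqrt(1.0 + k * k / (math.pi ** 2 / 6.0))
    return second_factor / (math.sqrt(n) * alpha)

def sigma_g_alpha_known(n, alpha):
    """Approximate standard deviation of g-hat when alpha' is known, equation (13)."""
    return 1.0 / (math.sqrt(n) * alpha)

def correlation(y0):
    """Correlation coefficient r of the asymptotic bivariate normal distribution."""
    k = 1.0 - EULER_C + y0
    return -k / math.sqrt(math.pi ** 2 / 6.0 + k * k)

# Rounded inputs taken from the example in the text.
n, alpha, y0 = 57, 0.01921, 4.60015
sigma = sigma_g(n, alpha, y0)
half_width = 1.96 * sigma  # half-width of the approximate 95% interval
```

Multiplying `sigma_g` by `sqrt(1 - r**2)` gives back `sigma_g_alpha_known`, which is the reduction described just before (13).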


REFERENCES

[1] B. F. Kimball, "Sufficient statistical estimation functions for the parameters of the
distribution of maximum values," Annals of Math. Stat., Vol. 17 (1946), pp.

[2] S. S. Wilks, Mathematical Statistics, Princeton Univ. Press, 1943, p. 139.



NOTES 

This section is devoted to brief research and expository articles and other short items.


TESTS OF INDEPENDENCE IN CONTINGENCY TABLES
AS UNCONDITIONAL TESTS

By A. M. Mood
Iowa State College

Summary and introduction. Since the ordinary tests for independence in
contingency tables use test criteria whose distributions depend on unknown
parameters, the justification for the tests is usually made either by an appeal to
asymptotic theory or by interpreting the tests as conditional tests. The latter
approach employs the conditional distribution of the cell frequencies given the
marginal totals, and was first described by Fisher [1]. The purpose of the
present note is to show how these tests may be regarded as unconditional tests
even though the parameters are unknown by augmenting the test criterion to
include estimates of the unknown parameters. We present no new tests,
merely a new setting for the old tests which seems to put them in a little better
light.


1. Certain conditional tests. A variate or set of variates x has a probability
density function $f(x; \theta)$ under a null hypothesis involving a parameter or set of
parameters $\theta$. When the parameters have a set of sufficient estimators $\hat{\theta}$ the
joint density function of a random sample of size n may be put in the form

n 

I 5). 

with “al^of® parameters. We .shall be (ion(mrnc*d 

th he class of tes criteria whmh are not functions of the estimators alone. 

a’ ^ criterion which may not be put in Iho form Xf^l 

Hh ( ^)k(d-, e), 

'‘“‘t "> * 

way that S, would have a niP^n 'K region .S'* in such a 

■ Th. ..th„s.,„. 

lu 



interested here only in the fact that the size of $S_c$ cannot be determined because
of the presence of the unknown parameters $\theta$ in $m(\lambda; \theta)$.

One can set up a conditional test by using the conditional distribution $k(\lambda \mid \hat{\theta})$.
That is, for fixed $\hat{\theta}$ the measure of any region $R(\hat{\theta})$ (which is measurable relative
to $k(\lambda \mid \hat{\theta})$, say, in the Lebesgue-Stieltjes sense) of the $\lambda$ space is known because
the $\hat{\theta}$ are known in any given instance. Thus a conditional test can be made
with a critical region $R_c(\hat{\theta})$ of prescribed size.

The conditional test may be interpreted as an unconditional test in the present
instance in the following manner: the unconditional test is made by using the
double criterion $(\lambda, \hat{\theta})$. The $(\lambda, \hat{\theta})$ space is divided into two regions, $T_a$ for
acceptance and $T_c$ for rejection. The critical region $T_c$ consists of all points
$(\lambda, \hat{\theta})$ such that $\lambda$ is contained in $R_c(\hat{\theta})$. If the size of $R_c(\hat{\theta})$ is $\alpha$ for all $\hat{\theta}$, then
the size of $T_c$ is also $\alpha$, for

(3) $\int_{T_c} k(\lambda \mid \hat{\theta})\,h(\hat{\theta}; \theta)\,d\lambda\,d\hat{\theta} = \int \left[\int_{R_c(\hat{\theta})} k(\lambda \mid \hat{\theta})\,d\lambda\right] h(\hat{\theta}; \theta)\,d\hat{\theta} = \int \alpha\,h(\hat{\theta}; \theta)\,d\hat{\theta} = \alpha.$

In this way one can make an unconditional test of the hypothesis with a critical
region of prescribed size; of course one does not have complete freedom to
specify the shape of $T_c$, but he can control it to the extent that $R_c(\hat{\theta})$ may be
chosen arbitrarily for every $\hat{\theta}$. $T_c$ is of course a similar region in the sense of
Neyman and Pearson [2, 3, 4] for the augmented criterion, and the construction
of $T_c$ is essentially the same as that used by Neyman and Pearson to test
parameters with sufficient estimators.

2. Application to contingency tables. As an illustration we shall follow
Wilks' [5] treatment of a two-way table with r rows and c columns; the cell
frequencies are $n_{ij}$ and the cell probabilities are $p_{ij}$ with

(4) $\Sigma n_{ij} = n; \quad \Sigma p_{ij} = 1; \quad i = 1, 2, \cdots, r; \quad j = 1, 2, \cdots, c.$

The sample is thus regarded as having come from a multinomial population.
We let

(5) $p_{i\cdot} = \sum_j p_{ij}; \quad p_{\cdot j} = \sum_i p_{ij}; \quad n_{i\cdot} = \sum_j n_{ij}; \quad n_{\cdot j} = \sum_i n_{ij}.$

The null hypothesis $H_0$ (of independence) corresponds to the subspace for which

(6) $p_{ij} = p_i q_j; \quad \Sigma p_i = 1 = \Sigma q_j$

in the parameter space of the $p_{ij}$. The likelihood ratio criterion for testing $H_0$ is

(7) $\lambda = \dfrac{(\prod n_{i\cdot}^{n_{i\cdot}})(\prod n_{\cdot j}^{n_{\cdot j}})}{n^n \prod n_{ij}^{n_{ij}}}$

and its distribution depends on the unknown parameters $p_i$ and $q_j$. However
the parameters have sufficient estimators

(8) $\hat{p}_i = n_{i\cdot}/n, \quad \hat{q}_j = n_{\cdot j}/n$

for the marginal distribution of the $n_{i\cdot}$ and $n_{\cdot j}$ is

(9) $\dfrac{(n!)^2}{(\prod n_{i\cdot}!)(\prod n_{\cdot j}!)} \prod p_i^{n_{i\cdot}} \prod q_j^{n_{\cdot j}}$

and when this is divided into the distribution of the $n_{ij}$ (under the null
hypothesis) one finds the conditional distribution of the $n_{ij}$ to be

(10) $g(n_{11}, n_{12}, \cdots, n_{rc} \mid n_{1\cdot}, n_{2\cdot}, \cdots, n_{\cdot c}) = \dfrac{(\prod n_{i\cdot}!)(\prod n_{\cdot j}!)}{n! \prod n_{ij}!}$

which is independent of the parameters. The distribution (10) is just the
combinatorial distribution used ordinarily in deriving the distribution of $\lambda$
for small samples. The test for independence is therefore a conditional test
which however may be interpreted as an unconditional test if the criterion $\lambda$ is
augmented by the estimators of the parameters under the null hypothesis.
Instead of the likelihood ratio criterion Karl Pearson's Chi-square criterion
could just as well have been used since its conditional distribution is also
determined by (10).

The usual difficulty due to discreteness arises in this application to contingency
tables. It is not possible to make the significance level exactly $\alpha$. In terms
of the notation of the first section, $R_c(\hat{\theta})$ cannot be chosen so that it will have
size exactly equal to $\alpha$ for all $\hat{\theta}$. One would ordinarily replace the equalities by
inequalities. The $R_c(\hat{\theta})$ would be chosen to have size less than but as close to $\alpha$
as possible. The size of $T_c$ is then unspecified and one can only state that his
significance level is less than $\alpha$. This difficulty is not particularly serious in
practice unless the test criterion has only one degree of freedom.
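
The argument can be illustrated numerically for a 2 x 2 table. The sketch below is not from the original note. It evaluates (10) for each pair of marginal totals, builds a conditional critical region of size at most alpha, and sums the null multinomial probabilities over the resulting unconditional region. Ordering the tables by their conditional probability is an assumption made here for simplicity (the note allows the region to be chosen arbitrarily for each set of margins), and all function names are hypothetical.

```python
from math import factorial

def conditional_prob(a, b, c, d):
    """Equation (10) for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    num = factorial(a + b) * factorial(c + d) * factorial(a + c) * factorial(b + d)
    den = factorial(n) * factorial(a) * factorial(b) * factorial(c) * factorial(d)
    return num / den

def tables_with_margins(n, row1, col1):
    """All 2 x 2 tables with first-row total row1 and first-column total col1."""
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    return [(a, row1 - a, col1 - a, n - row1 - col1 + a) for a in range(lo, hi + 1)]

def critical_region(n, row1, col1, alpha):
    """Conditional region: least probable tables first, total size kept <= alpha."""
    tables = sorted(tables_with_margins(n, row1, col1),
                    key=lambda t: conditional_prob(*t))
    region, size = [], 0.0
    for t in tables:
        pr = conditional_prob(*t)
        if size + pr > alpha:
            break
        region.append(t)
        size += pr
    return region

def null_prob(table, p1, q1):
    """Multinomial probability of a table under independence, p_ij = p_i q_j."""
    a, b, c, d = table
    n = a + b + c + d
    coef = factorial(n) // (factorial(a) * factorial(b) * factorial(c) * factorial(d))
    return (coef * (p1 * q1) ** a * (p1 * (1 - q1)) ** b
            * ((1 - p1) * q1) ** c * ((1 - p1) * (1 - q1)) ** d)

def unconditional_size(n, alpha, p1, q1):
    """Size of the unconditional region T_c for given null parameters."""
    return sum(null_prob(t, p1, q1)
               for row1 in range(n + 1)
               for col1 in range(n + 1)
               for t in critical_region(n, row1, col1, alpha))
```

Whatever values of `p1` and `q1` are supplied, `unconditional_size` stays at or below `alpha`, which is the inequality form of (3) described in the preceding paragraph.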

REFERENCES 

[11 E. A. auito, MelW, /„ oii,„ 

® '■Xnr’M 7° ?' '"'“'’1™ i-»l 

rni T M ^V-^oc. Phi. Trans,, SmcB A, Yo\ 231 n owi 

' otsteut' "Sufficieht statistics and miiroriiiJy'm,vi[, jiowcr/ul 

T * * of statistical hypotheses,” Slal. lien Memoirs Vnl. 1 

IS] S S. Wm^s, Malhmalical Statmes, Princeton 



THE 5% SIGNIFICANCE LEVELS FOR SUMS OF SQUARES
OF RANK DIFFERENCES AND A CORRECTION


By Edwin G. Olds
Carnegie Institute of Technology


About ten years ago this author published a paper [1], containing tables for
use in testing the significance of the rank correlation coefficient. In a paper on
non-parametric tests, [2, p. 310] Scheffé remarks that it would be desirable
to have these tables extended by inclusion of the 5% values. When the
computation was begun it was noted that a necessary formula was given incorrectly.
The main purpose of this note is to correct the formula and to extend Table V,
[1, p. 148]. Incidentally, a minor addition for Table III, [1, p. 143] will be
supplied.

The formula for the rank correlation coefficient, r', is given by

$r' = 1 - \dfrac{6\Sigma d^2}{n^3 - n},$

where n is the number of individuals ranked and $\Sigma d^2 = \sum_{i=1}^{n} d_i^2$ ($d_i$ being the rank
difference for the ith individual). As noted in the original paper, the null
hypothesis, r' = 0, is equivalent to the hypothesis $\Sigma d^2 = (n^3 - n)/6$, and the
latter hypothesis is slightly more convenient to test. Scheffé's remark seems
to be directed at Table V, which gives, for $11 \le n \le 30$, pairs of values between
which $\Sigma d^2$ has a probability, P, of being included. Values are tabled for P = .99,
.98, .96, .90 and .80. The necessary values for P = .95 are given below and
can easily be copied in the left-hand margin of the original Table [1, p. 148].
These values, as in the previous case, have been calculated by using the fact that

$\tfrac{1}{2}\Sigma d^2 - (n^3 - n)/12$

has an approximately normal distribution with a mean of zero and a variance of
$(n - 1)[n(n + 1)/12]^2$. In the original paper, [1, p. 142] the denominator
in the bracketed part of the variance was printed as 6, instead of 12.
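
The normal approximation can be sketched in a few lines of Python. This is an illustration rather than the author's computation: it assumes the variance $(n - 1)[n(n + 1)/12]^2$ refers to half the centred sum of squares, so that the standard deviation of $\Sigma d^2$ itself is twice the square root of that variance, and it uses the conventional two-sided normal deviate 1.96 for P = .95. The function name is hypothetical.

```python
import math

def rank_sum_limits(n, z=1.96):
    """Approximate limits between which the sum of squared rank differences
    falls with probability about .95 under the null hypothesis (normal curve)."""
    mean = (n ** 3 - n) / 6.0
    # Standard deviation of half the centred sum, from the corrected variance.
    sd_half = math.sqrt(n - 1) * n * (n + 1) / 12.0
    sd = 2.0 * sd_half
    return mean - z * sd, mean + z * sd

# Limits for the largest sample sizes covered by Table V.
for n in (27, 28, 29, 30):
    lower, upper = rank_sum_limits(n)
    print(n, round(lower, 1), round(upper, 1))
```

Under these assumptions the limits for n = 27 to 30 agree to one decimal place with the legible entries of the table of 5% values.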

In this author's original paper the exact frequencies of sums of squares of
rank differences were given for n = 2 to n = 7 inclusive, [1, p. 139]. The same
results, together with the results for n = 8, were obtained (independently) by
Kendall and others and published some months later, [3, p. 255]. Therefore,
it is possible to extend slightly the comparison of approximating functions
given in Table III, [1, p. 143]. Using Kendall's results for n = 8 it is found
that when the approximations obtained by using a Pearson Type II curve are
compared with exact results the average and maximum differences of cumulatives
are .0013 and .0067 respectively. When approximations are made by using the
normal curve the corresponding errors are .0081 and .0163.



REFERENCES

[1] E. G. Olds, "Distribution of the sums of squares of rank differences for small numbers
of individuals," Annals of Math. Stat., Vol. 9 (1938), pp. 133-148.

[2] H. Scheffé, "Statistical inference in the non-parametric case," Annals of Math. Stat.,
Vol. 14 (1943), pp. 305-332.

[3] M. G. Kendall, Sheila F. H. Kendall and B. Babington Smith, "The distribution
of Spearman's coefficient of rank correlation in a universe in which all rankings
occur an equal number of times," Biometrika, Vol. 30 (1939), pp. 251-273.



Values for P = .95

 n     Lower     Upper
20      ...      1928.0
21      ...      2214.9
22      ...      2528.5
23      ...      2869.8
24      ...      3240.0
25      ...      3640.2
26      ...      4071.6
27     2016.7    4535.3
28     2275.7    5032.3
29     2556.2    5563.8
30     2859.0    6131.0


INDEPENDENCE OF NON-NEGATIVE QUADRATIC FORMS IN
NORMALLY CORRELATED VARIABLES

By Bertil Matérn

Forest Research Institute, Experimentalfältet, Sweden

In a recent paper by the author [5] the following theorem has been mentioned
without proof. Though the theorem is very simple and easy to prove the
author has not found it elsewhere in the literature.

Theorem. If two non-negative quadratic forms in normally correlated variables
with zero means are uncorrelated the two forms are independent.

To prove the theorem, let the two forms be

(1) $Q_1 = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} x_i x_j, \quad Q_2 = \sum_{i=1}^{n}\sum_{j=1}^{n} b_{ij} x_i x_j,$

where the $x_i$'s are normally correlated and all have mean 0. By a well-known
theorem on quadratic forms we can reduce $Q_1$ and $Q_2$ to the forms

(2) $Q_1 = \sum_{i=1}^{n} c_i y_i^2, \quad Q_2 = \sum_{i=1}^{n} d_i z_i^2,$

where the $y_i$'s and $z_i$'s are linear functions of the $x_i$'s. In the 2n-dimensional
normal distribution of the $y_i$'s and the $z_j$'s, let $\rho_{ij}$ be the covariance of $y_i$ and
$z_j$. It is then easily shown that the covariance of $y_i^2$ and $z_j^2$ is $2\rho_{ij}^2$, and hence
that

(3) $\mathrm{cov}(Q_1, Q_2) = 2\sum_{i=1}^{n}\sum_{j=1}^{n} c_i d_j \rho_{ij}^2.$

As the forms are supposed to be non-negative all coefficients in (2) are
non-negative. If $Q_1$ and $Q_2$ are uncorrelated, each term on the right hand of (3)
must vanish. Consequently, if $c_i \ne 0$ and $d_j \ne 0$, we must have $\rho_{ij} = 0$. This
means that all $y_i$'s in $Q_1$ with non-zero coefficients are independent of all $z_j$'s in
$Q_2$ with non-zero coefficients. Hence $Q_1$ and $Q_2$ are independent. Q.E.D.

To see if $Q_1$ and $Q_2$ are uncorrelated we need an expression for the covariance
of the two forms in terms of the coefficients in (1) and the variances and
covariances of the original variables $x_i$. Let A and B be the matrices of the two
forms (1). Clearly we may suppose A and B to be symmetric. Let the
variance-covariance matrix of the $x_i$'s be L. By straightforward calculations we find

(4) $\mathrm{cov}(Q_1, Q_2) = 2\,\mathrm{Tr}\,ALBL.$

Here we have used Tr M to denote the "trace," i.e. the sum of the diagonal
elements in a square matrix M. In case of independent variables with variance 1,
we get

(5) $\mathrm{cov}(Q_1, Q_2) = 2\,\mathrm{Tr}\,AB.$

The formulae (4) and (5) are given in [5].
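
Formula (4) can be checked against a direct term-by-term evaluation. The sketch below is an illustration, not part of the original note. It relies on the standard fourth-moment identity for zero-mean normal variables, $\mathrm{cov}(x_i x_j, x_k x_l) = L_{ik}L_{jl} + L_{il}L_{jk}$, which is assumed here rather than taken from the text, and the small matrix helpers are hypothetical names written in plain Python.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trace(X):
    """Sum of the diagonal elements of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))

def cov_by_trace(A, B, L):
    """Right-hand side of formula (4): 2 Tr(ALBL)."""
    return 2 * trace(matmul(matmul(A, L), matmul(B, L)))

def cov_by_moments(A, B, L):
    """Covariance of the two forms summed term by term, using the
    fourth-moment identity for zero-mean normal variables."""
    n = len(L)
    total = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for l in range(n):
                    total += A[i][j] * B[k][l] * (L[i][k] * L[j][l] + L[i][l] * L[j][k])
    return total

# Symmetric integer matrices chosen only for illustration.
A = [[2, 1, 0], [1, 3, 1], [0, 1, 1]]
B = [[1, 0, 1], [0, 2, 0], [1, 0, 1]]
L = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
assert cov_by_trace(A, B, L) == cov_by_moments(A, B, L)
```

With L the identity matrix, `cov_by_trace` reduces to 2 Tr AB, which is formula (5).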



It is interesting to note the simplification of the independence condition given
in [2, 3] which is possible when the forms are assumed to be non-negative. It
may also be of interest to note that the condition for independence given in
the present theorem is identical with the corresponding condition for two linear
forms. (In fact, the latter condition has been used in the above proof.) Further
we observe that if $Q_1$ is the square of a linear form with mean 0, we get a necessary
and sufficient condition for independence between a linear form and a
non-negative quadratic form. The corresponding condition when $Q_2$ is not supposed
to be non-negative has been given in [4].

As an application consider a quadratic form Q in normally correlated variables.
Let it be known that Q has a $\chi^2$-distribution with f degrees of freedom. If
further

(6) $Q = Q_1 + Q_2 + \cdots + Q_k,$

where the $Q_i$'s are non-negative and mutually uncorrelated quadratic forms,
then each $Q_i$ has a $\chi^2$-distribution with $f_i$ degrees of freedom, say, and $\Sigma f_i = f$.

The proof with the aid of the above theorem is almost immediate. We thus
get another formulation of the theorem of Cochran [1] on the decomposition of a
quadratic form.


REFERENCKS 

[11 W, G, Cochran, "Diatnbution of quadratic fornio in a normal Hysteiii with apriliraiirius 

«I i T. Cpi,," MM. 

[4] M. Kac, ''A remark on independence of linear and (imulraiic fornw iavulvins indc- 
rtci n A/r 0/.S’Ittl., Vol, 1(1 (plifo ,]iat tin 

' -a 'unjo- u 2 k-'- 

1 of estimating the accuracy of line uml emnplc plot mirvmys'q 
Meddelanden frdn StaUm SkogsJorakmngainstUitl, Vol. SO tl'MT), pp. 1 


A FORMULA FOR THE PARTIAL SUMS OF SOME
HYPERGEOMETRIC SERIES


By Hermann von Schelling
Naval Medical Research Laboratory, New London, Conn.

same color. The probabilitv win 1 1 iT ‘ 

byalonn'iladaeto )?. EgBenbiger «nd 



(1) $w(m) = \binom{n}{m} \dfrac{a(a + \Delta) \cdots [a + (m - 1)\Delta] \cdot b(b + \Delta) \cdots [b + (n - m - 1)\Delta]}{N(N + \Delta) \cdots [N + (n - 1)\Delta]}$

(n fixed, m variable).

Now, we fix m and ask for the probability that the mth black ball appears at
the nth drawing. We find

(2) $w(n) = \binom{n - 1}{m - 1} \dfrac{a(a + \Delta) \cdots [a + (m - 1)\Delta] \cdot b(b + \Delta) \cdots [b + (n - m - 1)\Delta]}{N(N + \Delta) \cdots [N + (n - 1)\Delta]}$

(m fixed, n variable).


This fmietiniJ is the (n - + l)th element of the series 


" (" 1 l) • 

A \A / 

JG' 0 


■h (rit 


-c] 

. .p 


+ («i - 1) 


b N , 

1 T ) T "r 
A A 


■4 


Ctjiiwquenfly. ific pn.lt,ability (Iiat ilierjitli hluek ball appears at the latest in the 
nth drattiiii.' ri'iids, with an (jhvmu.s ahhre.viation, 


»v 

n'b<) ' u(»< 



Now, wf as,Minn* the wsth blaek hall did not appear in the rith drawing. What 
18 the nlternntUf','' 'rhe (n — ih + l)th white ball must have appeared in the 
Hth dravMiivE at 'I'lie eorrespoiuHng probability is according to the 

etpiatmii «H'i 


ITtnl ■ ^ ii’tJ't 

fr*"'Ai i5 ^ 



h (it - «t) 

[ h, - 7ii) 

« *** 



»i d* ^ + 1 


n). 


The relation Cdl Hriginate.s from (il) by writing b instead of a and (n - ni + 1) 
instead of iii . The nlterniilivett add to certainty: 

(5) irc«) + W(n) « 1. 



122 


HERMANN VON SCIIELLING 


Change the notations in the following manner; 

(6) ny-^a, I-^(3; j + ny-^y, n - uy + Iv. 


From (6.1) and (6.4) find by addition 

( 7 ) V + a — 1. 

From (6.1) and (6.3) 


( 8 ) 

From (6 2) and ( 8 ) 

(9) 


N 


a N b o 

— -- * y — ct — p. 

AAA 


Formula (5) reads now 

(7 - a - /3) (7 - a - /3 + 1) • • • (y ~ /3 - 1) >1 

-(^-«)(7-«+l)...(7-l) ^Aa,^,7,l) 

^(d + l )--'(0 + v- 1 ) 


( 10 ) 


+ 


(7 — a) (7 — a + 1 ) • • • (7 — a + r — 1 ) 

'Fa{v,y - P — tt,y — « + v;l) «■ 1. 


F,{a, ^, 7 ; 1) denotes the partial sum of the first r elements of the hypergeomotrio 
series F(ot, /3, 7 ; 1). It is to be mentioned that a is a positive integer necessarily 
as follows from (6.1). Since 


Tr(«>) = 1 


(7 - a - ^) (7 - g — g + 1) • •' (7 - /3 -- 1) 

(7 - a )(7 — g + 1 ) • • • (7 — 1 ) 




the relation ( 10 ) can be written 


('111 Fy(a,fi,y, 1) Fcjv, 7 — ^ — g, 7 — g + y; 1) 

F^{a, 0 ,y,l) F„((/, 7 — /3 - g, 7 — g + I/; 1) “ ’ 

where v and a are positive integers. 

This result is not interesting from the standpoint of pure mathematics since
the sum $F(\alpha, \beta, \gamma; 1)$ is known. But the relation is useful for the statisticians.
In calculating the function W(n) they need a sum of m elements instead of
(n - m + 1). If m is small (and this holds in practical applications), the
exact calculation of W(n) is possible for every n.
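
The saving described here can be checked with exact rational arithmetic. The sketch below is an illustration, not the author's computation. It assumes the usual urn scheme behind the note (a black and b white balls, N = a + b, and Delta balls of the drawn colour added after each drawing), takes the waiting-time probability to be the product form with binomial coefficient C(n - 1, m - 1), and uses hypothetical function names.

```python
from fractions import Fraction
from math import comb

def sequence_prob(k_black, k_white, a, b, delta):
    """Probability of one particular order of k_black black and k_white white draws."""
    num = 1
    for i in range(k_black):
        num *= a + i * delta
    for i in range(k_white):
        num *= b + i * delta
    den = 1
    for i in range(k_black + k_white):
        den *= (a + b) + i * delta
    return Fraction(num, den)

def w_wait(n, m, a, b, delta):
    """Probability that the m-th black ball appears exactly at the n-th drawing."""
    return comb(n - 1, m - 1) * sequence_prob(m, n - m, a, b, delta)

def W_direct(n, m, a, b, delta):
    """m-th black ball appears at the latest in the n-th drawing: n - m + 1 terms."""
    return sum(w_wait(j, m, a, b, delta) for j in range(m, n + 1))

def W_short(n, m, a, b, delta):
    """Same probability through the complementary event on white balls: m terms."""
    m_white = n - m + 1
    complement = sum(w_wait(j, m_white, b, a, delta) for j in range(m_white, n + 1))
    return 1 - complement

# Illustrative urn: 3 black, 5 white, 2 balls added after each drawing.
assert W_direct(6, 2, 3, 5, 2) == W_short(6, 2, 3, 5, 2)
```

`W_short` sums only m terms, so for small m it remains cheap however large n becomes.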


REFERENCES

[1] F. Eggenberger and G. Pólya, Zeits. f. angew. Math. und Mech., Vol. 3 (1923), pp.
279-289.

[2] H. von Schelling, Naturwiss., Vol. 30 (1942), pp. 757-758.



THE VARIANCE OF THE PROPORTIONS OF SAMPLES FALLING
WITHIN A FIXED INTERVAL FOR A
NORMAL POPULATION


By G. A. Baker
University of California, Davis


Suppose that we have a normal population

(1) $y = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left(-\dfrac{(x - m)^2}{2\sigma^2}\right)$

and we draw samples of N from this population. We wish to estimate the
proportion, p, of the population between two fixed limits, $m + \lambda\sigma$ and $m + \mu\sigma$.
One way to make this estimate is simply to count the number of observed x's
which fall in this interval. We shall denote this number by n. Then the ratio

(2) $n/N$

is an estimate of p. If this is done the variance of the estimate is well known to be

(3) $p(1 - p)/N.$

The method of estimating p by counting the number in a definite interval is
nonparametric and requires no assumption of normal or other specified type of
sampled population for validity. However, if we know that the sampled
population is normal then we may make use of this knowledge in estimating p
and possibly obtain an improved estimate.

Another way to estimate p which makes use of the form of the sampled
population is to compute

(4) $\bar{x} = \dfrac{1}{N}\sum_{i=1}^{N} x_i, \quad s^2 = \dfrac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2,$

and hence the integral

(5) $\int_{m + \lambda\sigma}^{m + \mu\sigma} \dfrac{1}{s\sqrt{2\pi}} \exp\left(-\dfrac{(x - \bar{x})^2}{2s^2}\right) dx.$

It is implied in elementary texts that (5) is a better estimate of p than is (2)
although this point is not discussed.

It is the purpose of the present note to discuss the variance of the estimate (5)
and compare this variance with (3).

Now (5) is a function of the first two moments of the sample and it follows
from an application of a theorem stated by H. Cramér [1] that (5) is
asymptotically normal with mean p and variance given by


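(6) $\dfrac{1}{2\pi N}\left[\left(e^{-\mu^2/2} - e^{-\lambda^2/2}\right)^2 + \tfrac{1}{2}\left(\mu e^{-\mu^2/2} - \lambda e^{-\lambda^2/2}\right)^2\right].$

To compare the relative efficiency of the counting method with (5) in complete
detail would be somewhat tedious. The referee suggests a brief discussion of
the cases $\lambda = -\infty$, where we are counting the proportion less than some known
value, and $\lambda = -\mu$, where a portion out of the middle of the distribution is being
counted. These cases are of particular practical interest.

If $\lambda = -\infty$, then (6) becomes

(7) $\dfrac{1}{2\pi N}\, e^{-\mu^2}\left(1 + \tfrac{1}{2}\mu^2\right).$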



We choose values of $\mu$ as indicated below:

   μ         p     Relative Efficiency of (3)
-2.3263    0.01    0.27
-1.2816    0.1     0.56
-0.8416    0.2     0.66
-0.5244    0.3     0.75
-0.2533    0.4     0.64
 0.0000    0.5     0.64

We get values of the relative efficiency of (3) that are low for small p and
somewhat higher for larger values of p.

If $\lambda = -\mu$, then (6) becomes

(8) $\dfrac{\mu^2 e^{-\mu^2}}{\pi N}.$

We choose values of $\mu$ as indicated below:

   μ         p     Relative Efficiency of (3)
 1.2816    0.8     0.63
 0.8416    0.6     0.46
 0.2533    0.2     0.12


We see that the relative efficiency of (3) ranges from close to 0.75 to rather small
values.

Other choices of $\lambda$ and $\mu$ yield relative efficiencies of about the same order of
magnitude as those illustrated.
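
The comparison can be repeated for any pair of limits. The sketch below is illustrative and not from the original note. It assumes that the asymptotic variance (6) has the form $[(e^{-\mu^2/2} - e^{-\lambda^2/2})^2 + \tfrac{1}{2}(\mu e^{-\mu^2/2} - \lambda e^{-\lambda^2/2})^2]/(2\pi N)$ and takes the relative efficiency of (3) to be the ratio of (6) to (3), in which N cancels. The function names are hypothetical.

```python
import math

def density_term(x):
    """exp(-x^2 / 2); zero at an infinite limit."""
    return 0.0 if math.isinf(x) else math.exp(-0.5 * x * x)

def weighted_term(x):
    """x * exp(-x^2 / 2); zero at an infinite limit."""
    return 0.0 if math.isinf(x) else x * math.exp(-0.5 * x * x)

def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def n_times_variance_fitted(lam, mu):
    """N times the asymptotic variance (6) of the fitted-normal estimate."""
    a = density_term(mu) - density_term(lam)
    b = weighted_term(mu) - weighted_term(lam)
    return (a * a + 0.5 * b * b) / (2.0 * math.pi)

def n_times_variance_counting(lam, mu):
    """N times the variance (3) of the counting estimate."""
    p = normal_cdf(mu) - normal_cdf(lam)
    return p * (1.0 - p)

def relative_efficiency(lam, mu):
    """Relative efficiency of the counting method (3) against the fitted estimate."""
    return n_times_variance_fitted(lam, mu) / n_times_variance_counting(lam, mu)

one_sided = relative_efficiency(-math.inf, 0.0)   # proportion below the mean
central = relative_efficiency(-1.2816, 1.2816)    # central 80 per cent
```

Passing `-math.inf` as `lam` gives the one-sided case (7), and passing `lam = -mu` gives the symmetric case (8).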


REFERENCE

[1] Harald Cramér, Mathematical Methods of Statistics, Princeton University Press,
1946, section 28.4, pp. 366-367.



THE POINT BISERIAL COEFFICIENT OF CORRELATION


By Joseph Lev

New York State Department of Civil Service


The product moment coefficient of correlation between a continuous variate y
and a variate x which takes the values 1 and 0 only, is known in psychological
statistics as the point biserial coefficient of correlation. Let $y_i$, i = 1, ..., n,
be observations on y; $y_{1i}$, i = 1, ..., $n_1$, be y values which are paired with the
value x = 1; $y_{0i}$, i = 1, ..., $n_0$, be values paired with x = 0; $\bar{y}_1$ and $\bar{y}_0$ be
the corresponding means; and $n = n_1 + n_0$. Then the point biserial coefficient
of correlation may be written


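(1) $r = \dfrac{(\bar{y}_1 - \bar{y}_0)\sqrt{n_1 n_0/n}}{\left[\sum_{i=0}^{1}\sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2\right]^{1/2}}.$

The distribution of r is readily obtained when the $y_i$, i = 1, ..., n, are
distributed as

(2) $\dfrac{1}{\sigma\sqrt{1 - \rho^2}\sqrt{2\pi}} \exp\left[-\dfrac{(y_i - a - \rho\sigma k_i)^2}{2\sigma^2(1 - \rho^2)}\right],$

where

$k_i = \sqrt{n_0/n_1}, \quad i = 1, 2, \cdots, n_1,$

$k_i = -\sqrt{n_1/n_0}, \quad i = n_1 + 1, n_1 + 2, \cdots, n,$

$\sigma^2$ is the variance of the $y_i$ about the common mean a, and $\rho$ is the parameter
which represents the correlation between the $y_i$ and the $x_i$. It is easy to verify
that the statistic in (1) is a maximum likelihood estimate of $\rho$.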

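It will be convenient to express the two population means in (2) as $\mu_1$ and $\mu_0$
so that

(3) $\mu_1 = a + \rho\sigma\sqrt{n_0/n_1}, \quad \mu_0 = a - \rho\sigma\sqrt{n_1/n_0}.$

Hence

(4) $\rho = \dfrac{\sqrt{n_1 n_0}}{n} \cdot \dfrac{\mu_1 - \mu_0}{\sigma}.$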



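Now write

(5) $t = \dfrac{(\bar{y}_1 - \bar{y}_0)\sqrt{n - 2}}{\sqrt{\dfrac{n}{n_1 n_0}}\left[\sum_{i=0}^{1}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2\right]^{1/2}} = \dfrac{\sqrt{n - 2}\, r}{\sqrt{1 - r^2}},$

where r is obtained from (1).

Using (5) we may write t as

$t = \dfrac{\dfrac{(\bar{y}_1 - \bar{y}_0) - (\mu_1 - \mu_0)}{\sigma\sqrt{1 - \rho^2}\sqrt{n/(n_1 n_0)}} + \dfrac{\mu_1 - \mu_0}{\sigma\sqrt{1 - \rho^2}\sqrt{n/(n_1 n_0)}}}{\left[\dfrac{\sum_{i=0}^{1}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2}{(n - 2)\sigma^2(1 - \rho^2)}\right]^{1/2}}.$

Therefore t has non-central t distribution [1] with

$f = n - 2, \quad \delta = \dfrac{\sqrt{n}\,\rho}{\sqrt{1 - \rho^2}}.$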
The methods and tables given in [1] may be used to calculate tests of significance
and confidence limits for $\rho$.

When $\rho = 0$, t has Student's distribution, and the statistic $t = \sqrt{n - 2}\, r/\sqrt{1 - r^2}$
may be used to test the hypothesis, $\rho = 0$, by means of the t tables
with n - 2 degrees of freedom. The non-central t distribution then determines
the power function of this test.
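
The equality between the statistic computed from r and the ordinary two-sample t can be confirmed numerically. The sketch below is an illustration with made-up data, not part of the original note; the function names are hypothetical, and the pooled within-group form of t is the usual textbook one, assumed here to match the definition in the text.

```python
import math

def point_biserial(y1, y0):
    """Point biserial r: group mean difference scaled by the total sum of squares."""
    n1, n0 = len(y1), len(y0)
    n = n1 + n0
    mean1, mean0 = sum(y1) / n1, sum(y0) / n0
    grand = (sum(y1) + sum(y0)) / n
    total_ss = sum((v - grand) ** 2 for v in y1 + y0)
    return (mean1 - mean0) * math.sqrt(n1 * n0 / n) / math.sqrt(total_ss)

def t_from_r(r, n):
    """The statistic sqrt(n - 2) r / sqrt(1 - r^2)."""
    return math.sqrt(n - 2) * r / math.sqrt(1.0 - r * r)

def pooled_t(y1, y0):
    """Two-sample t with pooled within-group variance on n - 2 degrees of freedom."""
    n1, n0 = len(y1), len(y0)
    n = n1 + n0
    mean1, mean0 = sum(y1) / n1, sum(y0) / n0
    within_ss = sum((v - mean1) ** 2 for v in y1) + sum((v - mean0) ** 2 for v in y0)
    return (mean1 - mean0) / math.sqrt(within_ss / (n - 2) * n / (n1 * n0))

# Hypothetical observations paired with x = 1 and x = 0.
y_with_x1 = [5.0, 7.0, 8.0, 9.0]
y_with_x0 = [2.0, 4.0, 4.0, 6.0, 3.0]
r = point_biserial(y_with_x1, y_with_x0)
t_value = t_from_r(r, len(y_with_x1) + len(y_with_x0))
```

The two routes agree to rounding error, so a test of $\rho = 0$ based on r is the same as the two-sample t test on the group means.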

Table IV of [1] can be used to calculate confidence limits for $\rho$. If the
confidence interval is to be based on equal tails of the distribution choose a confidence
coefficient $1 - 2\epsilon$. Then compute $\delta(f, t_0, \epsilon)$ and $\delta(f, t_0, 1 - \epsilon)$, where $f = n - 2$,
and $t_0 = \sqrt{n - 2}\, r/\sqrt{1 - r^2}$.

A lower limit for $\rho$ is given by

$\dfrac{\delta(f, t_0, \epsilon)}{[n + \delta^2(f, t_0, \epsilon)]^{1/2}}$

and an upper limit by

$\dfrac{\delta(f, t_0, 1 - \epsilon)}{[n + \delta^2(f, t_0, 1 - \epsilon)]^{1/2}}.$

REFERENCE

[1] N. L. Johnson and B. L. Welch, "Applications of non-central t-distribution,"
Biometrika, Vol. 31 (1940), pp. 362-389.



A NOTE ON KAC'S DERIVATION OF THE DISTRIBUTION OF THE
MEAN DEVIATION

By H. J. Godwin
University College of Swansea, Wales

In a paper on a general class of estimates of deviations, Kac [3] obtained an
expression for the frequency function of the estimate of mean deviation from
the mean in normal samples. He was unable to establish the identity of this
with an expression obtained earlier by me [1]. I now show that the two results
are, in fact, equivalent.

Kac iiws the functions defined a.s the k - fold convolution of 



I used tin* funetioiiH (h(x) defined by the recurrence relation 

(1) OoW « 1. Oi(x) = r di 

•'O 

Now I have shewn elsewhere [2] that the integral of c“*'**+ " taken through 
the interior of a ref^tlar simplex in k dimen.sions, with its centroid at the origin 
and of sidea, + 1 Cn{a/-\/2), The relation (1) corresponds to a dissection 
of the simplex into sections, which are {k ~ D-dimensional simple.xes, by joining 
the centroid to the vertices and taking sections parallel to the base of each of the 
{k + 1) amaller simplexes so formed. If however we take sections parallel 
to a bast) of the whole simplex we get another recurrence relation, viz, 

(2) (?* (x) - r 

Jo 

Now (2) may be re-written 

n* Jo n*~‘ 

whence, by induction, and the equivalence 

of Kao’s result to mine is established. 

REFERENCES 

[1] H. J. Godwin, "On the distribution of the estimate of mean deviation obtained from
samples from a normal population," Biometrika, Vol. 33 (1945), pp. 254-256.

[2] H. J. Godwin, "A further note on the mean deviation," Biometrika, Vol. 35 (in the
press).

[3] M. Kac, "On the characteristic functions of the distributions of estimates of various
deviations in samples from a normal population," Annals of Math. Stat., Vol. 19
(1948), pp. 257-261.



CORRECTION TO "ASYMPTOTIC FORMULAS FOR SIGNIFICANCE
LEVELS OF CERTAIN DISTRIBUTIONS"

By A. M. Peiser
New York City

Professor Henry Scheffé has recently pointed out to me an error in my paper
"Asymptotic formulas for significance levels of certain distributions," which
appeared in Annals of Math. Stat., Vol. 14 (1943), pp. 56-62. In the
determination of the significance levels of Student's t distribution, appeal was made to a
theorem of Cramér which requires independent random variables. The variables
defined at the top of page 61, however, cannot be taken as independent, so that
the theorem does not apply.

The asymptotic formula (following the notation of the paper) 



where 

‘^(Vp) = 1 - P, ^’(a;) = do, 

is nevertheless correct. This may be shown directly from tho distribution 
function 


nr.) II ’ + 0 ) 




Writing 


and using Stirling’s formula, it follows that G„(x) can be written in the form 




where $Q_n(t)$ is a bounded function of t and n in each finite interval.

Let $t_{p,n} = y_p + a_n$, where $a_n = o(1)$. Then $G_n(t_{p,n}) = \Phi(y_p) = 1 - p$, and
we have



CORRECTION 


129 


nan. 


^iVp + Or.) - ^( yp) 

a„ 


so that 


(Vp + OnY + i Vp + fln) 

4V^ 


g”!(»p+a„l> 



lim fla„ 


y p + Vp 
4 


Thi« is the required result. 





ABSTRACTS OF PAPERS 

(Abstracts of papers presented at the Seattle Meeting of the Institute on
November 26-27, 1948)

1. Estimation of the Variance of the Bivariate Normal Distribution. Harry
M. Hughes, University of California, Berkeley.

Let $x_1$ and $x_2$ be two random variables normally distributed with known means $m_1$ and $m_2$,
and with common unknown variance $\sigma^2$. Consider an experiment in which the observed
variable is $\sqrt{(x_1 - m_1)^2 + (x_2 - m_2)^2}$. This paper considers the problem of estimating
the parameter $\sigma$ when the observations are grouped. By the method of minimum reduced
chi-square with linear restrictions, two best asymptotically normal estimates are derived.
By minimization of the asymptotic variance of these estimates, the optimum choice of
grouping is found as a function of $\sigma$. For two and for three groups, when it is known or
assumed a priori that $\sigma$ is on a certain finite interval, the optimum grouping is derived
which will minimize the maximum asymptotic variance on that interval. If such interval
is moderately small, it is shown that the optimum grouping is the same as if $\sigma$ were known to
have the value at the upper end of the interval. Finally the effect of using non-optimum
grouping is analyzed.


2. Derivation of a Broad Class of Consistent Estimates. R. C. Davis,
Inyokern, California.

Given a chance vector X with cumulative distribution function $F(X, \theta)$, where $\theta$ is an
unknown parameter vector, a broad class of estimates of $\theta$ is derived having the following
properties: a) any estimate in this class is a consistent estimate of $\theta$; b) any estimate is a
symmetric function of independent observations of the chance vector X. The novel feature
of this class is that no assumptions about the existence of various partial derivatives of a
density function with respect to $\theta$ are made. As a matter of fact not even the existence of
a density function is required, and it is postulated merely that a continuous function of X
for each $\theta$ (in a certain neighborhood of the true $\theta$) and of $\theta$ for each X exist which satisfies
a Lipschitz condition in $\theta$. For each such function having a finite first moment an estimate
of $\theta$ is constructed which has the properties a) and b) listed above.


3. Locally Best Unbiased Estimates. Edwarb W. Barankin, University of 
California, Berkeley. 


Let V = 1pj(x); 9 « 0) be a family of probability densities in the space B of points a- 
and g a function one Let s be fixed and >1; call an unbiased estimate of g best at Ot if its 
s-th absolute central moment (s.a.c.m.) under pj, is (finite and) not grnator tlian the 
S;a c^m., under of any unbiased estimate of g. With a certain integrability postulat! 

pxiHt sufficient condition, of finite character, is oatahlislied for the 

e.xiBtence of anunbiased estimate of g having a finite s.a.c.m. under pj,. When such ft one 
exists, there then exists a umgue unbiased estimate which is best at Si,. The existence 
condition defines the s.a c.m. of the best estimate explicitly as the l.u.b. of a set of number 
in particular we obtain immediately a generalization of the Cramdr-Eao inequality Also’ 
when it exis s, the best unbiased estimate is explicitly constructed from the funetbn 0 




‘I. Some Problems Related to the Distribution of a Random Number of Random 
Variables. IObwaud Patlson, University uf Washington, Seattle. 


Let $x_i$ ($i = 1, 2, \cdots$) be a set of independent random variables with identical distributions,
with $E(x) = a$ and $\sigma^2(x) = b$ ($0 < b < \infty$). Let N be a positive integral-valued random
variable with distribution $F_\lambda(N)$ depending on a parameter $\lambda$, where $E(N) = A\lambda$ and
$\sigma^2(N) = B\lambda$ [...]. Now let $T_N = x_1 + x_2 + \cdots + x_N$. The limiting distribution as
$\lambda \to \infty$ of

$$\frac{T_N - aA\lambda}{\sqrt{a^2 B\lambda + bA\lambda}}$$

has been investigated by Robbins (Proc. Nat. Acad. Science, Vol. 34 (April 1948), pp. 162-163)
for several different sets of conditions on the distribution of N. It can be shown that
analogous results will hold if instead of $T_N$ we consider a more general statistic $S_N$, whose
conditional distribution with respect to the variable N is such that there exist constants
$a_1$ and $b_1$ so that

$$\lim_{N \to \infty} E\left[\exp\left(it\,\frac{S_N - a_1 N}{\sqrt{b_1 N}}\right) \,\Big|\, N\right] = h(t)$$

uniformly in any finite t interval, where h(t) is a characteristic function. Returning to the
statistic $T_N$, it can be shown that there exists an asymptotic expansion in powers of
$\lambda^{-1/2}$ with remainder for

$$P\left(\frac{T_N - aA\lambda}{\sqrt{a^2 B\lambda + bA\lambda}} < w\right)$$

when the following conditions are satisfied: (1) the distribution function of x has a non-zero
absolutely continuous component, (2) [...], and (3) $\lambda \to \infty$ through integral values, and
$F_\lambda(N)$ is the $\lambda$th convolution of a random variable n such that [...] $< \infty$.
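
The normalization of a random sum by its mean $aA\lambda$ and variance $a^2B\lambda + bA\lambda$ can be checked numerically. The following is only a minimal simulation sketch under assumptions that are not part of the abstract: x is taken to be exponential with mean 1 (so a = b = 1), and N is taken to be the sum of $\lambda$ independent counts uniform on {0, 1, 2} (so A = 1, B = 2/3). The function names are hypothetical.

```python
import random
import statistics


def random_sum(lam, rng):
    """Draw one value of T_N = x_1 + ... + x_N under the illustrative model.

    N is the sum of `lam` independent counts, each uniform on {0, 1, 2},
    so E(N) = A*lam with A = 1 and var(N) = B*lam with B = 2/3.
    Each x is exponential with mean a = 1 and variance b = 1.
    """
    n_total = sum(rng.choice((0, 1, 2)) for _ in range(lam))
    return sum(rng.expovariate(1.0) for _ in range(n_total))


def standardized_sums(lam, reps, seed=0):
    """Return `reps` draws of (T_N - a*A*lam) / sqrt(a^2*B*lam + b*A*lam)."""
    rng = random.Random(seed)
    a, b, A, B = 1.0, 1.0, 1.0, 2.0 / 3.0  # assumed values, see lead-in
    center = a * A * lam
    scale = (a * a * B * lam + b * A * lam) ** 0.5
    return [(random_sum(lam, rng) - center) / scale for _ in range(reps)]


if __name__ == "__main__":
    z = standardized_sums(lam=50, reps=4000)
    # Both values should be close to 0 and 1 if the normalization is right.
    print("mean:", statistics.fmean(z))
    print("variance:", statistics.pvariance(z))
```

A sample mean near 0 and a sample variance near 1 are what the normalization predicts; the sketch says nothing about the asymptotic expansion itself.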


5. Asymptotic Expansions for the Distribution of Certain Likelihood Ratio
Statistics. Albert H. Bowker, Stanford University.

Asymptotic expansions of the "Cramérian" type are derived for the distribution of
likelihood ratio statistics given by Wilks for testing various hypotheses about means,
variances, and covariances of a normal multivariate distribution. The point of departure
is Wilks' result that minus twice the logarithm of the likelihood ratio has the $\chi^2$ distribution;
terms in $1/N$, $1/N^2$, $\cdots$ may also be expressed in terms of the $\chi^2$ distribution. In addition,
asymptotic expansions of the "Fisher-Cornish" type are obtained for the percentage points
and for a transformation of the statistic to a $\chi^2$ variate.


6. On a Problem of Confounding in Symmetrical Factorial Design. Esther
Seiden, University of California, Berkeley.

Let $m_3(r, s)$ be the maximum number of factors that it is possible to accommodate in a
symmetrical factorial experiment in which each factor is at $s = p^n$ levels (p being any
positive prime number, n a positive integer) and each block is of size $s^r$, without confounding
any degrees of freedom belonging to any interaction involving 3 or a lesser number of
factors.

R. C. Bose proved in a paper, "Mathematical theory of factorial design," Sankhyā, Vol. 8
(1947), pp. 107-166, that the following inequality holds:
$s^2 + 1 \le m_3(4, s) \le s^2 + s + 2$. This gives in case s = 4, $17 \le m_3(4, 4) \le 22$. It is now
proved that $m_3(4, 4) = 17$.

The proof consists in showing that the maximum number of points, no three collinear,
which can be chosen in a finite projective space PG(3, 4) cannot exceed 17, which according
to a proof of R. C. Bose is equivalent to the statement that $m_3(4, 4)$ cannot exceed 17.
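
The arithmetic behind the case s = 4 can be sketched in a few lines. This assumes the quoted inequality reads $s^2 + 1 \le m_3(4, s) \le s^2 + s + 2$, which is a reconstruction from a damaged line; the function name is hypothetical.

```python
def bose_bounds(s):
    """Return (lower, upper) = (s^2 + 1, s^2 + s + 2), the assumed bounds on m_3(4, s)."""
    return s * s + 1, s * s + s + 2


if __name__ == "__main__":
    lower, upper = bose_bounds(4)
    print(lower, upper)  # the abstract states that the true value is the lower bound
```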





7. Some Bounded Significance Level Tests of Whether the Largest Observations
of a Set are Too Small (Preliminary Report). John E. Walsh, Santa
Monica, California.

A set of n observations is given which satisfy: (1) They are independent and from
continuous symmetrical populations; (2) The r largest observations are from populations
with median $\theta$ while the remaining observations are from populations with median $\phi$. It
is required to test whether $\theta < \phi$. Let x(1), $\cdots$, x(n) denote the observations arranged
in increasing order of magnitude. For r = 1, tests of the form: Accept $\theta < \phi$ if
$x(n) < 2x(w_\alpha) - x(t)$, where $\alpha = \Pr[x(n) + x(t) < 2\theta \mid \theta = \phi]$ and $w_\alpha$ is the smallest
integer satisfying $\Pr[x(w_\alpha) > \theta \mid \theta = \phi] \le \alpha$, can be obtained for n > 15. Exact significance
levels can be obtained by assuming a sample from a specified population (e.g. normal). On the
basis of (1)-(2) alone, the significance level never exceeds $2\alpha$. For large n, tests can be
obtained for any r if the observations satisfy the additional weak condition: (3) The tail order
statistics are approximately independent of the central order statistics; also the variances
of the tail order statistics are either very large or very small compared with the variances
of the central order statistics. The test is: Accept $\theta < \phi$ if
$\max[x(i_k) + x(n - j_k)] < 2x(w_\alpha)$, where [...], $w_\alpha$ is the smallest integer satisfying
$\Pr[x(w_\alpha) > \theta \mid \theta = \phi] \le \alpha$, and $\alpha = \Pr\{\max[x(i_k) + x(n - j_k)] < 2\theta \mid \theta = \phi\}$. For
large n the significance level is approximately $\alpha$ but is $\le 2\alpha$ for all n. The power function
$\to 1$ as $\phi - \theta \to \infty$ and $\to 0$ as $\phi - \theta \to -\infty$.


8. Determination of Optimal Test Length to Maximize the Multiple Correlation.
Paul Horst, University of Washington, Seattle.

If the lengths of the tests in a battery are altered, their intercorrelations and their validities
or correlations with a criterion are also altered. Consequently, the multiple correlation
of the battery with the criterion will also be altered. These changes are a function of the
reliabilities of the tests. Suppose we have given from a set of experimental data (1) the
time allowed for each test in the battery, (2) the reliability of each test, (3) the intercorrelations,
and (4) the validities of all the tests. If we specify the overall testing time we are
willing to allow for the test in the future, we can determine the amount by which each test
must be altered in order to give the maximum multiple correlation with the criterion. The
method, together with numerical examples and the mathematical proof, is presented.


9. Some Numerical Comparisons of a Non-Parametric Test with other Tests.
F. J. Massey, University of Oregon, Eugene.

Let F(x) be the cumulative distribution function of a R.V. X, and let $x_1 < x_2 < \cdots < x_n$
be the results of n independent observations ordered as to size.

Define $S_n(x) = 0$ if $x < x_1$;

$= k/n$ if $x_k \le x < x_{k+1}$;

$= 1$ if $x_n \le x$.

To test the hypothesis $H_0$: $F(x) = F_0(x)$, where $F_0(x)$ is completely specified, use the
criterion [...]

[...] can be calculated.
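
The step function defined above is straightforward to compute. In the sketch below, the criterion itself is an assumption: the source line stating it is illegible, so the code uses the maximum absolute deviation between the hypothesized $F_0$ and the step function (the Kolmogorov statistic) purely for illustration. The function names are hypothetical.

```python
def empirical_cdf(sample):
    """Return the step function of the sample: the fraction of observations <= x."""
    xs = sorted(sample)
    n = len(xs)

    def step(x):
        # Count of observations not exceeding x, divided by n.
        return sum(1 for v in xs if v <= x) / n

    return step


def max_deviation(sample, f0):
    """Largest |F0(x) - step(x)| over all x, for a continuous hypothesized F0.

    Assumed criterion (Kolmogorov-type); not confirmed by the source text.
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # The step function jumps from (i-1)/n to i/n at x, so check both sides.
        d = max(d, i / n - f0(x), f0(x) - (i - 1) / n)
    return d


if __name__ == "__main__":
    uniform_cdf = lambda x: min(max(x, 0.0), 1.0)
    print(max_deviation([0.7, 0.1, 0.4], uniform_cdf))
```

Because the step function is constant between observations and $F_0$ is monotone, the supremum is attained at a jump point, which is why only the two sides of each jump are examined.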





10. On the Deviation of Extreme Values. W. J. Dixon, University of Oregon,
Eugene.


Let x(i) be the ith observation in order of magnitude in a sample of size n. The distribution
of R = [...]/(x(n) - x(1)) is obtained explicitly for samples from a rectangular distribution
and for n = [...] for samples from a normal distribution. Percentage values of R for
values of n up to [...] are presented. Generalizations of R are indicated.


11. The Optimum Size of Interval for Making Measurements of a Rocket's
Angular Velocity. Edward A. Fay, University of California, Berkeley.

Over a given range of time $0 < \tau < T$, the angular velocity of a rocket's spin is adequately
represented by a polynomial $\xi(\tau)$ of given degree s - 1 but with unknown coefficients.
The rocket's angular acceleration and the angle through which it spins in a given time-interval
may then be obtained respectively by differentiating and integrating $\xi(\tau)$. Let
n be an integer $\ge s$, let $l = T/n$, and let $\eta_i$ be the angle through which the rocket turns in the
interval $(i - 1)l < \tau < il$. While $\xi(\tau)$ and $\xi'(\tau)$ cannot be directly observed, the angles
$\eta_1, \eta_2, \cdots, \eta_n$ can. Let $F_i$ be an observation on $\eta_i$, and assume that $F_1, F_2, \cdots, F_n$
are independent homoscedastic variables. The $F_i$ may then be combined by the method
of least squares to obtain best linear estimates $X(\tau, l)$ and $X'(\tau, l)$ of $\xi(\tau)$ and $\xi'(\tau)$. The
choice of l is at the observer's disposal. For the cases s = 2, 3, 4, and for the cases that the
common variance of the $F_i$ is (a) independent of l or (b) proportional to l, an expression is
derived for the variance of $X'(\tau, l)$, and the maximum value of that variance over the
range of $\tau$ is minimized with respect to l. The method is of much more general application.


12. Stationary Time Series Analysis and Common Stock Price Forecasting.
Zenon Szatrowski, University of Oregon, Eugene.

The objective of this paper is to present a statistical procedure of practical value in the
problem of extracting information from the past behaviour of economic time series, information
to be used in projecting future patterns. The author feels that his approach yields
results closer to reality than the techniques described by Herman Wold, M. G. Kendall,
H. T. Davis, and in particular, the technique of "disturbed harmonics" used by G. U. Yule.
The idea of the proposed technique can be described by examining the autoregression
scheme, which seems to be considered the most desirable by the above men. A simple
example of such a scheme is the equation

$$u_{t+1} = a - b u_t + \epsilon_{t+1},$$

where the u's are the time series values and $\epsilon$'s are random elements. The above linear
relationship, when determined either directly or through an empirical correlogram (for
which data is usually inadequate) is a kind of an average relationship. It may be as inappropriate
in estimating future values of a time series as would be an average in estimating
the level of a series with a pronounced trend.

The author proposes using derived time series to shed light on the nature of the changes
in the parameters under consideration. Such derived series could be estimates of the a's
and b's for successive time periods. The author has found that projections of common
stock price fluctuations were improved considerably when the changing nature of the
cyclical pattern was taken into account. This was done by constructing derived time
series, "moving" estimates of the amplitude, period and phase of the dominant harmonics.





The author points out that the above approach has shown promise in commodity prices
as well as common stocks. The value of this approach in forecasting lies in the facts that
(1) it does not require forecasts of other series and (2) it is based on the realistic assumption
that history repeats itself but with variations, variations which may be taken into account
through appropriate models.
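
The idea of a derived series of "moving" parameter estimates can be illustrated as follows. This is only a sketch under an assumption: the autoregression scheme is taken to be first order, $u_{t+1} = a - b u_t + \epsilon_{t+1}$, which is a reading of a damaged equation, and the harmonic (amplitude, period, phase) estimates the author actually used are not reproduced. Function names are hypothetical.

```python
def fit_first_order(series):
    """Least-squares fit of u[t+1] = a - b*u[t] + e[t+1]; returns (a, b)."""
    x = series[:-1]  # u[t]
    y = series[1:]   # u[t+1]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The regression slope estimates -b, the intercept estimates a.
    return intercept, -slope


def moving_estimates(series, window):
    """Derived series: (a, b) fitted on each run of `window` consecutive values."""
    return [fit_first_order(series[i:i + window])
            for i in range(len(series) - window + 1)]


if __name__ == "__main__":
    u = [0.0]
    for _ in range(29):
        u.append(1.0 + 0.9 * u[-1])  # noise-free series with a = 1, b = -0.9
    for a, b in moving_estimates(u, window=6)[:3]:
        print(a, b)
```

On a series whose parameters drift over time, plotting the successive (a, b) pairs would show the change that a single overall fit averages away.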

13. Distribution of the Number of Schools of Fish Caught Per Boat. J. Neyman,
University of California, Berkeley.

Let $\lambda$ be the average number of schools of fish per unit area of a fishing ground A. Let a
be any area partial to A, and let $\Pi(m, a, \lambda)$ denote the probability that exactly m schools of
fish will be found within a. At time t = 0 a boat begins scouting for fish in A traveling at
constant speed v. It is assumed that all schools of fish within distance r of the boat are
detected and none is detected at a greater distance. If $m \ge 1$ schools are detected then they
are caught in turn, the catching of one school taking up exactly h hours. X(t) denotes the
random variable representing, for each $t \ge 0$, the number of schools caught up to time t
including the one which may be in the process of being caught at the moment t. Probability
distribution of X(t) is given by the formula

$$P\{X(t) \le k\} = \sum_{m=0}^{k} \Pi[m,\ 2rv(t - kh),\ \lambda]$$

for k = 0, 1, 2, $\cdots$, n - 1, where n - 1 is the greatest integer smaller than t/h. Of course
$P\{X(t) \le n\} = 1$. This result is easily generalized for the joint distribution of catches of
several boats fishing in the same area so that their paths do not cross. Assuming specific
functions to represent $\Pi(m, a, \lambda)$ formulae may be obtained to estimate the parameters $\lambda$
and rv.
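
As an illustration of the formula, the sketch below evaluates $P\{X(t) \le k\}$ under one specific choice of $\Pi(m, a, \lambda)$. The Poisson form with mean $\lambda a$ is an assumption made here for the example; the abstract only says that specific functions may be assumed. Function and argument names are hypothetical.

```python
import math


def poisson_pi(m, area, lam):
    """Assumed form of Pi(m, a, lambda): Poisson probability with mean lambda * a."""
    mean = lam * area
    return math.exp(-mean) * mean ** m / math.factorial(m)


def prob_at_most(k, t, lam, r, v, h):
    """P{X(t) <= k}: at most k schools caught (or being caught) by time t."""
    if k < 0:
        return 0.0
    if k * h >= t:
        # k is at least n, where n - 1 is the greatest integer smaller than t/h.
        return 1.0
    # Area swept while scouting for t - k*h hours with detection radius r.
    area = 2.0 * r * v * (t - k * h)
    return sum(poisson_pi(m, area, lam) for m in range(k + 1))


if __name__ == "__main__":
    for k in range(5):
        print(k, prob_at_most(k, t=2.0, lam=0.5, r=1.0, v=1.0, h=0.5))
```

The swept area shrinks with k because catching k schools uses up kh hours that are no longer available for scouting.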

14. Some Problems in Fishery Research to which Statistical Methods are
Applicable (Preliminary Report). Ralph P. Silliman, U. S. Fish and
Wildlife Service, Seattle.

One of the most difficult problems is the obtaining of a random sample of a fish population.
Rarely are such populations randomly distributed over any area, and the samples
must often be taken from the catches of fishing vessels which do not uniformly cover even a
part of the area of distribution of the population. Many distributions of variables found in
fishery research are not normal, and statistical methods based on the normal distribution
can be applied only through the use of unsatisfactory transformations. Since fishery
research is largely observational in technique, data reflecting the concurrent effect of
several variables are usually obtained. Although the present methods of multiple correlation
and regression can be used in some instances to measure the relative effect of the
separate variables, there are many situations in which these methods do not yield good
results. Finally, many data used in fishery research must be adjusted before use, and
existing methods do not give good measures of the expected variability of such adjusted
data. Examples of specific problems are found in the distribution of deliveries and the
variations in catch of Columbia River chinook salmon.

15. The Application of the Hypergeometric Distribution to Problems of Esti¬ 
mating and Comparing Zoological Population Sizes. Douglas Chapman, 
University of California, Berkeley. 

Estimates and tests of the type, as developed by Neyman, are adapted to sampling 
without replacement from a finite population. These results are applied to problems of 





estimation and comparison of zoological population sizes as determined by sampling procedures.
For single samples the bias and variance of different estimators is compared.
Finally some numerical calculations are made for various population and sample sizes to
determine how different sample sizes and different methods of analysis affect the size of the
critical region which is necessarily an approximation to the desired size. For some of
these the power of the test is considered.

16. Extension to Multivariate Case of Neyman's Smooth Test with Astronomical
Application. Elizabeth L. Scott, University of California, Berkeley.

It is more or less generally accepted that the distribution of extra-galactic nebulae in
space is not uniform in the small. In particular, counts in small cubes show distinct signs
of contagion. On the other hand, it is not settled whether or not lack of uniformity in the
large exists. One way of making this statement precise is to assert that the power series
expansion of the logarithm of the probability density of the two angular coordinates of the
nebulae within a given large area on the unit sphere does not contain low order terms.
In fact, any such low order terms could be interpreted as determining "trends" or what
could be described as lack of uniformity in the large. From this point of view, uniformity
in the large may be tested by a two dimensional Neyman Smooth Test for goodness of fit.

Let $\{\pi_{ij}(x, y)\}$ be a sequence of polynomials in x and y ortho-normal for $|x| < a$ and
$|y| < b$. If $x_k$ and $y_k$ are the coordinates of the kth out of n nebulae counted within the
rectangle (-a, a), (-b, b) then the smooth test of mth order consists of rejecting the hypothesis
of uniformity in the large when

$$\sum_{i+j=1}^{m} \left( \sum_{k=1}^{n} \pi_{ij}(x_k, y_k) \right)^2 \ge n\chi_\epsilon^2,$$

where $\chi_\epsilon^2$ is the tabled value of $\chi^2$ with m(m + 3)/2 degrees of freedom.

17. A Mathematical Theory of Vitamin A Metabolism in Fish (Preliminary
Report). Norman E. Cooke, Vancouver, B.C.

Several possible hypotheses for vitamin A metabolism in fish are developed from simple
postulates. These hypotheses are tested (by least squares method) against experimental
data in an attempt to deduce the correct mechanism.

18. The Interactance Hypothesis between Populations. Stuart C. Dodd, 
University of Washington, Seattle. 

The hypothesis of interacting between human populations, or of demographic gravitation,
is that the number of interactions between two communities (or other groups) tends
to vary directly with the product of the two populations and their "specific coefficients"
and the overall duration and tends to vary inversely with the intervening distance and the
average duration of an interact. The hypothesis is tested by isolating factors and measuring
their correlation with the amount of interacting in the pairs of a set of N communities.

This hypothesis is supported by studies of telephoning; news circulating; travel by bus,
train, or plane; R. R. express; college attendance; intermarrying; etc. Further lists of
interhuman actions are suggested for investigation.

A new corroborating bit of data comes from a poll by the Washington Public Opinion
Laboratory in a Seattle housing project where negro-white relations threatened violence.
The tension units of verbal interaction (defined as one anti-negro opinion asserted by one
white person) were observed to decrease inversely with a power of the distance from a rape
site. The observed tension correlated with the formulas or curves predicting that tension
at p = .04 and passed the chi-square test at the one per cent level. The tension is dimensionally
analyzed as a social force and social energy.





19. The Employment of Marked Members in the Estimation of Animal Populations.
Milner B. Schaefer, U. S. Fish and Wildlife Service, Honolulu,
T. H.

The estimation of population numbers by marked members is an important technique in
fisheries research. The number N of individuals in the population, of which T are known
to be marked, may be estimated from a sample of n of which t are found to be marked.
Several estimates are available, all of which reduce to $nT/t$ when the numbers are all
large, but more precise formulae should be used when the numbers are not all large. An
estimate of the variance of N has been derived by Karl Pearson (Biometrika, Vol. 20 (1928),
pp. 149-174) on the basis of inverse probabilities. The sampling error may also be measured
by means of confidence intervals. Formulae have been developed for estimating N from
repeated samples of the same population, but no very suitable estimates of the sampling
error are available in this case. For some migratory fishes marked at a point on their
migration path and sampled later at another point, there exists a correlation between time
of marking and time of recovery in the subsequent samples. In such case, the total number
of fish marked or drawn in the subsequent samples cannot in general be regarded as random
samples of the population. Where numbered tags are employed as marks, so the fish may
be individually identified both when marked and recovered, a method of estimating N in
this case also is suggested.
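
The large-sample estimate $nT/t$ can be tried on simulated data. The sketch below is illustrative only: it assumes simple random sampling without replacement, which the abstract itself warns may fail for migratory fish, and it does not implement any of the "more precise formulae" mentioned. Function names are hypothetical.

```python
import random


def marked_estimate(n, T, t):
    """Large-sample estimate n*T/t of the population size N."""
    if t <= 0:
        raise ValueError("no marked members in the sample; n*T/t is undefined")
    return n * T / t


def simulate_recapture(N, T, n, seed=0):
    """Count marked members in a random sample of n drawn without replacement."""
    rng = random.Random(seed)
    # Members 0 .. T-1 play the role of the marked individuals.
    sample = rng.sample(range(N), n)
    return sum(1 for member in sample if member < T)


if __name__ == "__main__":
    t = simulate_recapture(N=1000, T=200, n=100)
    if t > 0:
        print("marked in sample:", t, "estimate of N:", marked_estimate(100, 200, t))
```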

20. Non-Response and Repeated Call-Backs in Sampling Surveys. Z. W.
Birnbaum and Monroe G. Sirken, University of Washington, Seattle.

In opinion-polls and other sampling surveys, a response can only be obtained from those
individuals of a sample who are available for interviewing. Let $p_1$ be the probability that
an individual chosen at random from the population answers "yes" to a question, $p_2$ that
an individual is available for interviewing, and $p_{12}$ that an individual is available and
answers "yes." Usually one wishes to estimate the parameter $p_1$, but from a sample it
is only possible to estimate $p_{12}/p_2 = p'$ = the probability that an individual answers "yes"
if he is available. Thus the total error in estimating $p_1$ from a sample contains two components:
the bias $p_1 - p'$ and the sampling error. In this paper a technique is presented
in which individuals not available at a call are called upon repeatedly, up to k times. It
is shown how, for a given upper bound of the total error at a prescribed probability level
and a given k, it is possible to minimize the cost of the survey by optimizing the relationship
between the greatest possible bias and the sampling error.
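
The bias component can be made concrete with a small numerical example. The probabilities used below are invented for illustration, and the notation $p_1$, $p_2$, $p_{12}$ is a reading of damaged subscripts; the cost optimization described in the abstract is not attempted. Function names are hypothetical.

```python
def available_yes_rate(p12, p2):
    """p' = p12 / p2: probability of "yes" given that the individual is available."""
    return p12 / p2


def nonresponse_bias(p1, p2, p12):
    """Bias p1 - p' incurred when only available individuals are interviewed."""
    return p1 - available_yes_rate(p12, p2)


if __name__ == "__main__":
    # Invented values: half the population says "yes", 80% are available,
    # and 44% are both available and say "yes".
    print(nonresponse_bias(p1=0.5, p2=0.8, p12=0.44))
```

The bias vanishes exactly when availability and the answer are independent, that is, when $p_{12} = p_1 p_2$.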


(Abstracts of papers presented at the Cleveland Meeting of the Institute on December
27-30, 1948.)


21. A Necessary Condition for a Certain Class of Characteristic Functions
(Preliminary Report). Eugene Lukacs, NOTS, Inyokern, California and
Our Lady of Cincinnati College, Cincinnati, Ohio.


Let $\varphi(t)$ = [...] be the reciprocal of a polynomial without
multiple roots. The following necessary condition is derived which $\varphi(t)$ has to satisfy in
order to be the Fourier transform (characteristic function) of a distribution.





If $\varphi(t)$ is the Fourier transform of a distribution, then

1) The polynomial $1/\varphi(t)$ has no real roots. If $b + ia$ ($a \ne 0$, $b \ne 0$) is a root then $-b + ia$ is also a root.
That is, the roots of $1/\varphi(t)$ are either located on the imaginary axis or are symmetrical to this
axis.

2) If $b + ia$ ($a \ne 0$) is a root then there exists also at least one root $i\alpha$ so that sign $\alpha$ =
sign $a$ and $|\alpha| \le |a|$.

As a particular case one obtains the well known fact that $(1 + t^4)^{-1}$ cannot be a characteristic
function.

22. Precision of Estimates from Samples Selected under Marginal Restrictions
(Preliminary Report). Clifford J. Maloney, Camp Detrick, Frederick,
Maryland.

Formulas are derived for estimates and for their variances computed from samples drawn
at random subject only to marginal restrictions from populations classified by several
characters, and estimates are made of the efficiency of such sampling plans compared to
sampling with complete stratification or sampling completely at random. By means of two
simple but general theorems it is shown that the variances are independent of the individual
values of the character being sampled for in the population and in the sample and depend
only on the first two moments for each cell of the population. It is shown that in the large
sample approximation a practical scheme for actually drawing such samples can be obtained
by drawing a sample of size n entirely at random and using the results of Deming and
Stephan (Annals of Math. Stat., Vol. 11 (1940), p. 427) to adjust the sample marginal totals
to the specified values. Deficient cells will of course be filled up by additional drawings.
A measure is given of the relative loss of information in sampling with marginal restrictions
on the sample cell numbers compared to sampling with complete stratification. If $a_{ij}$
represents the population mean in the ijth cell, $r_i$ the population mean in the ith row and
$c_j$ the population mean in the jth column, and if $a_{ij}$ is of the form $a_{ij} = a + r_i + c_j$, then
marginally restricted sampling is as efficient as sampling with complete stratification. For
arbitrary $a_{ij}$ a measure of the relative efficiency compared to sampling completely at random
is given by the relative degrees of freedom for the sample cell numbers. A comparison
with other possible sampling procedures is given.

23. Properties of Maximum- and Quasi-Maximum Likelihood Estimates of
Parameters of a System of Linear Stochastic Difference Equations with
Serially Correlated Disturbances (Preliminary Report). Herman Rubin,
Cowles Commission, The University of Chicago.

Let [...] be a complete system of linear stochastic difference equations, $x_t = (y_t, z_t)$,
$y_t$ jointly dependent, $z_t$ predetermined. Let us suppose [...], where
the random vectors $v_t$ are serially independent and have mean zero. If the vectors $v_t$
have the same Gaussian distribution, and the system is identified, we can obtain maximum-likelihood
estimates; if the distributions are not identical Gaussian, quasi-maximum-likelihood
estimates result. The identification problem is a special case of that with independent
$u_t$ and bilinear restrictions on some [...], if the restrictions on [...] are linear or bilinear.
As in that case, we may have multiple identification. However, the special aspects of this
type of system yield some help in the discussion of the identification problem. We also
observe that if the system is identified, we obtain consistency and asymptotic normality
of the estimates under the same conditions as with serially independent u's for [...].

24. The Computation of Maximum Likelihood Estimates of Parameters of a
System of Linear Stochastic Difference Equations with Serially Correlated
Disturbances. Herman Chernoff, Cowles Commission, The University
of Chicago.

Consider the structural equations [...] $= u_t'$ where the vector $x_t = (y_t, z_t)$, $y_t$ are the
jointly dependent, and $z_t$ the predetermined variables and where $u_t$ are serially correlated.
In particular assume that the disturbances $u_t$ satisfy the simple Markoff Process
[...] where $\epsilon_t$ is a stationary serially uncorrelated Gaussian Process with zero
mean. Then we have [...]. The estimates of [...] and [...] can be
simply expressed in terms of those of [...]. It is shown that iterative gradient methods of
maximization require about 2 to 3 times as much work per iteration as in the serially uncorrelated
case. To apply the Newton Method about 8 times as much work per iteration
is required. The Newton Method uses the second order terms of the expansion of the log
of the likelihood in terms of the independent parameters of [...] and these can be used to
obtain estimates of the asymptotic covariance of the estimates.

25. Test Criteria for Hypotheses of Symmetry and Definiteness of a Regression 
Matrix for Demand Functions. Uttam Chand, University of North 
Carolina. 

The importance of relations between two sets of variates (e.g. the study of relations of
the prices to the quantities of several commodities) invariant under linear transformations
of one set of variates contragredient to those of the other was first pointed out by Hotelling.
In the study of related demand functions no suitable statistical tests have existed for
testing the hypotheses of symmetry and negative definiteness of the regression matrix of
prices on quantities. The test proposed here for the hypothesis of symmetry is exact and
invariant under all contragredient transformations. A separate test studied for both
symmetry and negative definiteness satisfies the property of invariance but its distribution
depends on a nuisance parameter which is the non-zero root of a certain determinantal
equation. The likelihood ratio criterion under the hypothesis of symmetry leads to a multilateral
matric equation which represents $\tfrac{1}{2}p(p+1)$ equations of the third degree in $\tfrac{1}{2}p(p+1)$
unknown regression coefficients for the p-variate case, and does not admit of a unique
solution.


26. The Distribution of Extreme Values in Samples whose Members are Stochastically
Dependent. Benjamin Epstein, Wayne University, Detroit.

In this paper the following problem is considered: To find the distribution of largest
and smallest values in samples of size n drawn from a random process subject to the following
conditions:

(i) observations $x_1, x_2, \cdots, x_n$ are taken in order from some random process.

(ii) the random process is such that successive observations $x_i$ and $x_{i+1}$ are jointly
dependent. The joint distribution is described analytically, independently of i,
by a two-dimensional d.f.

$F_2(x, y) = \text{Prob}\,(x_i \le x,\ x_{i+1} \le y)$, $1 \le i \le n - 1$.

(iii) $F_2(x, y) = F_2(y, x)$.

(iv) Any other pairs of observations $(x_i, x_{i+j})$, $1 \le i \le n - 1$, $2 \le j \le n - i$, are
assumed to be independent.

The results in this paper generalize the special situation where all observations are independent.
More general cases than those covered by (i)-(iv) are briefly considered.





27. On Age-Dependent Stochastic Branching Processes. Richard Bellman
and Theodore E. Harris, Stanford University and The RAND Corporation,
Palo Alto and Santa Monica, California.

An initial particle has a random life length T with c.d.f. G(t). At death it is replaced
by a random number N of similar particles; $P(N = n) = q_n$. Particles produced have the
same distributions of life-length and replacement as the original one.

Let z(t) = number of particles at time t, $h(s) = \sum_{n=0}^{\infty} q_n s^n$,
$F(s, t) = \sum_{n=0}^{\infty} P(z(t) = n)\, s^n$. The integral equation

$$F(s, t) = \int_0^t h[F(s, t - y)]\, dG(y) + s[1 - G(t)]$$

uniquely determines F(s, t). When suitable restrictions are put on h(s) and G(t), results of
Feller can be applied to study the asymptotic behavior of the moments of z(t), which satisfy
linear integral equations of the convolution type, and further special results on the moments
can be obtained. The condition $\sum n q_n > 1$ and certain further restrictions insure that
$z(t)/e^{bt}$ converges in probability as $t \to \infty$, where b is a certain constant. The m.g.f. $\phi(s)$
of the limiting distribution satisfies the equation

$$\phi(s) = \int_0^\infty h[\phi(s e^{-by})]\, dG(y).$$

Further restrictions imply that $\phi(s)$ is analytic in a neighborhood of s = 0, and that the
corresponding distribution is absolutely continuous.

28. Cuboidal Lattices. G. S. Watson, Institute of Statistics, North Carolina.

Yates has given two series of partially balanced incomplete block designs, square and
cubic lattices, which enable the experimenter to test respectively $k^2$ and $k^3$ varieties in
blocks of size k. Harshbarger has recently given a series of designs, rectangular lattices,
which supplement Yates' square lattices.

In this paper two series of designs are given called cuboidal lattices, supplementing the
cubic lattice series. They may be used to test respectively $k^2(k + 1)$ and $k(k + 1)^2$ varieties
in blocks of k, when the number of replications is a multiple of 3. Interblock information
may be recovered. The first series has a relatively simple analysis and should prove useful.
This work was sponsored by the Office of Naval Research.

29. Transformations Induced by Series Approximation of Prior Probability
Amplitude. Archie Blake, Office of The Surgeon General, U. S. Army.

Consider a class A of mutually exclusive and exhaustive possible outcomes of a test.
(We assume A finite; this condition can under suitable conditions be removed at a later
stage by a limiting process.) For a hypothesis h, let u be the vector whose value, for each
member a of A, is the square root of the prior probability of a and h jointly. This vector
is called the probability amplitude; its norm, the scalar product u'u, is proportional to the
prior probability of h, the constant of proportionality being determined by comparing the
norms of the u's for all h. Let the test leave the alternatives of a subclass B of A still
possible, while ruling out the members of A - B. Represent this test by a vector r having
the value 1 on B and 0 on A - B. Define d on AA as a matrix equal to r on the main diagonal
and zero elsewhere. The posterior probability is proportional to the form value
u'du, the norm of the projection of u on a subspace determined by suppressing the coordinates
of A - B. Consider the transformation u = tv, t being a matrix on AA and v
a vector on A. Then u'du takes the form v't'dtv. Denote t'dt by e. If u is approximated
as a partial sum of the series tv, i.e. by truncating v with a subclass C of A, the truncation
induced on e is that with the minor on CC. (How much of the prior probability norm is





retained with a particular truncation is most easily seen if t is orthogonal, for then the
transform of u'u is v'v.)

For example, in an agricultural experiment, let A be the composite of P, the class of
plots, and Y, the class of possible yields on a plot. Then u takes the form of a second
order tensor or matrix on PY, while d and t are fourth order tensors. For some member
y of Y, it often happens that some of the initially most probable, numerous, and economically
consequential hypotheses will be such that for them the values of u(y) are predominantly
high on some row of plots, low on another row, etc. The transformation u = tv
on P induces the transformation e = t'dt; this is R. A. Fisher's transformation, performed,
however, on d instead of on the yields themselves. The truncation of v and e corresponds
to Fisher's relegating the higher interactions to error. This calculation may be accompanied
by a linear transformation on Y, e.g. in series of orthogonal functions. (Such
series are not subject to the disadvantage of classical Gram-Charlier series, which are
expressed in terms of the probability instead of its square root, that their partial sums
can be at places negative.)


30. On the Utilization of Marked Specimens in Estimating Populations of
Flying Insects. Cecil C. Craig, University of Michigan, Ann Arbor.

The experimenter catches flying insects, say butterflies, marks and immediately releases
them. It is assumed that all the insects in a segregated area are equally liable to capture
whether unmarked or marked, even several times, and that the population is stable for the
period over which a series of captures is made. From the record of insects caught once,
twice, three times, and so on, the problem is to estimate the total population. Two mathematical
models which seem appropriate are considered and four methods of estimation are
compared with respect to the large sample variances of the estimates they give.


31. On a Probability Distribution. Max A. Woodbury, University of Michigan,
Ann Arbor.


In this paper the probability of x successes in n trials of an event is computed for the
case when the probability of success in a given trial depends only on the number of previous
successes. The solution P(n, x) satisfies the equation of partial differences

$$P(n + 1, x + 1) = (g - q_x) P(n, x) + q_{x+1} P(n, x + 1)$$

in the case when g = 1. The boundary conditions are obviously P(0, 0) = 1 and P(n, x) = 0

for x < 0 or x > n. The solution of this equation is obtained by use of a generating function
[...] expansion of $q^n$ by means of Newton's [...]. Specifically, by setting [...],

$$P(n, x) = \sum_i q_i^n \big/ \left[(q_i - q_0) \cdots (q_i - q_x)\right].$$

In the case $p_i = p_0$ one has the result

$$P(n, x) = \frac{p_0^x}{x!} \, \frac{d^x q_0^n}{d q_0^x},$$

which yields the usual result on simplification.
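
The equation of partial differences in this abstract can also be tabulated directly, trial by trial. The Python sketch below is only an illustration and is not the generating-function solution the abstract describes. It assumes that `p[x]` denotes the probability of success on a trial preceded by x successes, so that q_x = 1 - p[x]; the function name `success_count_distribution` is hypothetical.

```python
from math import comb


def success_count_distribution(p, n):
    """Return [P(n, 0), ..., P(n, n)] for success probabilities p[x].

    p[x] is assumed to be the probability of success on a trial that is
    preceded by exactly x successes (an assumption of this sketch).
    """
    if len(p) < n:
        raise ValueError("need p[0], ..., p[n-1]")
    dist = [1.0]  # boundary condition P(0, 0) = 1
    for _ in range(n):
        nxt = [0.0] * (len(dist) + 1)
        for x, prob in enumerate(dist):
            nxt[x] += (1.0 - p[x]) * prob  # failure: the count stays at x
            nxt[x + 1] += p[x] * prob      # success: the count moves to x + 1
        dist = nxt
    return dist


if __name__ == "__main__":
    # With a constant success probability the table should agree with the
    # ordinary binomial distribution; print the largest discrepancy.
    n, p0 = 5, 0.3
    dist = success_count_distribution([p0] * n, n)
    binomial = [comb(n, x) * p0**x * (1 - p0) ** (n - x) for x in range(n + 1)]
    print(max(abs(a - b) for a, b in zip(dist, binomial)))
```

The constant-probability check mirrors the abstract's remark that the special case yields the usual result.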


32. Distribution-Free Tests of Data from Factorial Experiments. G. W. Brown
and A. M. Mood, Iowa State College.

A device for avoiding the assumption of normality in analysis of variance problems was
developed by M. Friedman (Amer. Stat. Assoc. Jour., Vol. 32 (1937), pp. 675-701) in which the
values of the observations were replaced by their ranks.

An alternative approach is presented here in which medians are used to construct certain
contingency tables, and the various null hypotheses of interest are easily tested by means
of the ordinary chi-square criterion applied to such tables. These tests:

(1) Avoid the assumption of normality. 

(2) Are particularly sensitive to differences in locations of cell distributions but not to 
their shapes. 

(3) Usually require very little arithmetic computation. 

The tests and the relevant distribution theory have been worked out for some of the 
simpler experimental designs. 
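
The median-based contingency table can be illustrated for the simplest case, a one-way layout. The Python sketch below is an assumption-laden illustration and not necessarily the authors' construction for factorial designs: it classifies each observation as above or not above the pooled median, forms the resulting 2 x k table, and computes the ordinary chi-square criterion. The function name `median_chi_square` and the convention that ties with the median count as "not above" are choices made here.

```python
from statistics import median


def median_chi_square(groups):
    """Chi-square statistic and degrees of freedom for a 2 x k median table."""
    pooled = [value for group in groups for value in group]
    pooled_median = median(pooled)
    # Observed counts in each group: above the pooled median / not above it.
    above = [sum(1 for value in group if value > pooled_median) for group in groups]
    not_above = [len(group) - a for group, a in zip(groups, above)]
    total = len(pooled)
    total_above, total_not_above = sum(above), sum(not_above)
    if total_above == 0 or total_not_above == 0:
        raise ValueError("all observations fall on one side of the median")
    chi2 = 0.0
    for group, a, b in zip(groups, above, not_above):
        expected_above = len(group) * total_above / total
        expected_not_above = len(group) * total_not_above / total
        chi2 += (a - expected_above) ** 2 / expected_above
        chi2 += (b - expected_not_above) ** 2 / expected_not_above
    return chi2, len(groups) - 1


if __name__ == "__main__":
    print(median_chi_square([[1, 2, 3, 4], [5, 6, 7, 8]]))
```

Only counts enter the computation, which is why such tests need little arithmetic and respond to differences in location rather than shape.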


33. On Sums of Symmetrically Truncated Normal Random Variables. Fred 
C. Andrews and Z. W. Birnbaum, University of Washington, Seattle. 

Let $X_a$ be the random variable with the probability density

$f_a(X) = [\ldots]$ for $|X| < a$, $f_a(X) = 0$ for $|X| > a$,

and let $S_a^{(n)} = \sum_{i=1}^{n} X_a^{(i)}$, where $X_a^{(1)}, \ldots, X_a^{(n)}$ are independent determinations of $X_a$.

The problem considered is, for given n, T > 0, $\epsilon$ > 0, to determine a such that
$P(|S_a^{(n)}| > T) = \epsilon$. The exact solution of this problem would require laborious computations.
In this paper a method is given for obtaining approximate values of a which are
"safe", i.e. such that $P(|S_a^{(n)}| > T) \le \epsilon$.

34. On the Foundation of Statistics. Max A. Woodbury, University of 
Michigan, Ann Arbor. 

The results in this paper are part of the author's University of Michigan dissertation,
"Probability and Expected Values." The work covered by this paper was sponsored by
the Office of Naval Research. One may take the notion of an expected value as the basis
for the theory of Statistics; i.e. a linear functional on a linear space of random variables
(real valued functions defined over a population). The space is called statistical if it contains
all constant functions and the expected value of such constant functions is just the
constant and if the expected value of a non-negative function is non-negative. A statistical
space is called strong if it contains with a random variable also the random variable whose
values are the absolute values of the given random variable. Every expected value defines
a probability measure over a quorum of subsets of the population and it is shown that the
integral of the random variable, if it exists, coincides with the expected value. Further
it is shown that if the statistical space is strong the integral necessarily exists and also
that a necessary and sufficient condition that the quorum be a field is that the statistical
space be strong.

35. Finitely Additive Probability Functions. Max A. Woodbury, University
of Michigan, Ann Arbor.

The results in this paper are part of the work in the author's University of Michigan
dissertation, "Probability and Expected Values." The work covered by this paper was
sponsored by the Office of Naval Research. A quorum is a family of sets that contains
with each pair of disjoint sets also their union and also the complement of any of its sets.
Trivially a quorum is required to contain at least one set and hence at least the universe set
or population and the empty set. An extension of the notion of a finitely additive probability
measure function to quorums is given and proved to be equivalent to the usual
definition in case the quorum is a field of sets. The extension of a quorum of sets relative
to the probability measure function is investigated using the properties of the inner and
outer measure. The upper and lower integrals are defined and a condition for the existence
of the integral is given. When the quorum is a field it is shown that integrability of a
function implies the existence of the distribution function. This last result is well known
in the case where the probability measure function is completely additive.
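
A small finite example may help to separate a quorum from a field. The Python sketch below is an illustration added here, not taken from the paper: it checks the two closure properties in the definition (complements, and unions of disjoint pairs) and shows that the subsets of a four-point population having an even number of elements form a quorum that is not a field. The helper names `is_quorum` and `is_field` are hypothetical.

```python
from itertools import combinations


def is_quorum(family, universe):
    """Non-empty, closed under complements and under unions of disjoint pairs."""
    family = set(family)
    if not family:
        return False
    if any((universe - s) not in family for s in family):
        return False
    return all(
        (a | b) in family
        for a, b in combinations(family, 2)
        if not (a & b)  # only disjoint pairs are required to have their union
    )


def is_field(family, universe):
    """A quorum that is also closed under unions of arbitrary pairs."""
    family = set(family)
    return is_quorum(family, universe) and all(
        (a | b) in family for a, b in combinations(family, 2)
    )


universe = frozenset({1, 2, 3, 4})
even_sized = {frozenset(c) for r in (0, 2, 4) for c in combinations(sorted(universe), r)}
power_set = {frozenset(c) for r in range(5) for c in combinations(sorted(universe), r)}

if __name__ == "__main__":
    print(is_quorum(even_sized, universe), is_field(even_sized, universe))
    print(is_quorum(power_set, universe), is_field(power_set, universe))
```

In the even-sized family the sets {1, 2} and {2, 3} overlap, and their union {1, 2, 3} is not a member, which is exactly the closure a field requires and a quorum does not.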

36. On Inverting a Matrix via the Gram-Schmidt Orthogonalization Process.
Max A. Woodbury, University of Michigan, Ann Arbor.

The application of the classical Gram-Schmidt orthogonalization process to the factorization
of a correlation matrix is accomplished by considering the inner product $(x, y) = E(xy)$
in the linear space determined by the statistical variables $x_1, x_2, \ldots, x_n$. In this
way a representation of the original set of statistical variables in terms of an orthonormal
set is obtained. (By an orthonormal set we mean a set $\xi_1, \xi_2, \ldots, \xi_n$ such that $E(\xi_i \xi_j) = 0$
for $i \ne j$ and $E(\xi_i^2) = 1$.) The matrix of coefficients $B = (b_{ij})$, where $x_i = \sum_j b_{ij} \xi_j$, has
the property that $C = BB'$, where $C = (E(x_i x_j))$ and $'$ denotes the transpose. Further, the
matrix B is triangular, hence $B^{-1}$ is readily computed, from which one obtains at once
$C^{-1} = (B^{-1})' B^{-1}$. The quantities $b_{ij}$ are readily obtainable by the method of determinants
(Dwyer and Waugh, Annals of Math. Stat., Vol. 16 (1946), pp. 269-271, cf. pg. 204), formerly
called the method of multiplication and subtraction with division.
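
The route from a triangular factor to the inverse can be sketched in a few lines. The Python code below is a minimal illustration under stated assumptions: it obtains a lower triangular B with C = BB' by the standard Cholesky recursion (which produces the same factor as Gram-Schmidt orthogonalization of the variables in their given order, with positive diagonal), not by the method of determinants cited in the abstract, and it assumes C is symmetric and positive definite. All function names are hypothetical.

```python
def triangular_factor(c):
    """Lower triangular B with C = B B' (C symmetric positive definite)."""
    n = len(c)
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = c[i][j] - sum(b[i][k] * b[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                b[i][j] = s ** 0.5
            else:
                b[i][j] = s / b[j][j]
    return b


def invert_lower_triangular(b):
    """Inverse of a lower triangular matrix by forward substitution."""
    n = len(b)
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        inv[i][i] = 1.0 / b[i][i]
        for j in range(i):
            inv[i][j] = -sum(b[i][k] * inv[k][j] for k in range(j, i)) / b[i][i]
    return inv


def invert_via_factorization(c):
    """Compute the inverse of C as (B^-1)' B^-1, where C = B B'."""
    n = len(c)
    b_inv = invert_lower_triangular(triangular_factor(c))
    return [
        [sum(b_inv[k][i] * b_inv[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    print(invert_via_factorization([[1.0, 0.5], [0.5, 1.0]]))
```

Because B is triangular, its inverse needs only forward substitution, which is the computational advantage the abstract points to.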

37. Certain Properties of the Multiparameter Unbiased Estimates. G. R.
Seth, Iowa State College.

If $\theta^* = (\theta_1^*, \theta_2^*, \ldots, \theta_q^*)$ is an unbiased vector estimate of $\theta = (\theta_1, \theta_2, \ldots, \theta_q)$ in the
density function $p(x_1, \ldots, x_n; \theta_1, \theta_2, \ldots, \theta_q)$ having the smallest concentration
ellipsoid among the class of unbiased estimates of $\theta$, and further if $\epsilon$ is any statistic of q
components having $E(\epsilon) = 0$ and finite covariance matrix, then $\epsilon$ is uncorrelated with $\theta^*$.

If a set of sufficient statistics $(T_1, T_2, \ldots, T_p)$, $p \ge q$, exists for estimating $\theta$, then
corresponding to any unbiased vector estimate $\phi^*$ of $\theta$, there exists an unbiased estimate
of $\theta$ depending on $T_1, T_2, \ldots, T_p$ alone, where the latter has a concentration ellipsoid
equal to or contained in that of the former.

When q = 1, and $\phi^*$ has the smallest variance among the class c formed by unbiased estimates
of $\theta$ which are functions of $\theta^*$ having a finite variance, and the set of polynomials
with respect to the distribution function of $\phi^*$ is complete, then $\phi^*$ is the only element in
the class c. For q > 1, the result holds when the "variance" is replaced by the "concentration
ellipsoid."

38. A Class of Lower Bounds for the Variance of Point Estimates. Douglas
Chapman, University of California, Berkeley.

A class of lower bounds for the variance of point estimates is derived by means of the
calculus of finite differences under very weak restrictions and it is shown that they give
valid lower bounds for certain parameter estimation problems for which the Cramér-Rao
formula is invalid. In some cases even when the latter lower bound exists a sharper lower
bound may be found in the class here defined. On the other hand when it exists, the Cramér-Rao
lower bound is asymptotically superior to any of this class.

39. Standard Errors and Tests of Significance for Interpolated Medians.
Churchill Eisenhart and Miriam L. Yevick, National Bureau of Standards.





If a sample of N observations is grouped by a sequence of class intervals with boundaries
$-\infty, \ldots, x_{-1}, x_0, x_1, \ldots, +\infty$, where $x_0$ is the largest boundary point for which
the observed 'fraction below', $p_a$, is less than $\tfrac{1}{2}$, and $x_1$ is the smallest boundary point for
which the observed 'fraction above', $p_b$, is less than $\tfrac{1}{2}$, so that the observed 'central fraction',
$p_c$, between $x_0$ and $x_1$ is positive, then, at least for the case of N large, standard textbooks
take as the median of the grouped data the interpolated median

$$m = x_0 + b(x_1 - x_0),$$

where

$$b = (\tfrac{1}{2} - p_a)/p_c.$$

The literature is silent regarding the sampling properties of such medians, and regarding
tests of significance appropriate to them. Let $P_a$ and $P_c$ be the population fractions
below $x_0$, and between $x_0$ and $x_1$, respectively, and let $\mu$ and $\beta$ be the population analogs
of m and b obtained by replacing $p_a$ and $p_c$ in the above equations by $P_a$ and $P_c$, respectively.
It is shown that m is asymptotically normally distributed about $\mu$ so defined with
asymptotic variance given by

$$\frac{1}{N Y_c^2} \left[ P_a(1 - P_a) - 2\beta P_a P_c + \beta^2 P_c(1 - P_c) \right],$$

where $Y_c = P_c/(x_1 - x_0)$ = ordinate of 'central rectangle' of 'population histogram'. The
classical formula for the variance of a median can be obtained as the limit of the above
when $(x_1 - x_0) \to 0$ with $P_a \to \tfrac{1}{2}$.

In addition, tests of hypotheses regarding the value of the 'interpolated median of the
population', $\mu$, and regarding the difference, $\mu_2 - \mu_1$, of the interpolated medians of two
populations, are developed (1) by utilizing the above asymptotic results, and (2) by utilizing
the Neyman-Pearson likelihood-ratio-test approach.
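
The interpolated median and its asymptotic variance are easy to evaluate numerically. The Python sketch below is a hedged illustration of the two formulas in this abstract: the function names are hypothetical, and supplying sample fractions in place of the population fractions $P_a$ and $P_c$ (to obtain an estimated variance) would be an assumption of the user rather than something the abstract prescribes.

```python
from math import sqrt


def interpolated_median(x0, x1, p_a, p_c):
    """m = x0 + b (x1 - x0) with b = (1/2 - p_a) / p_c."""
    b = (0.5 - p_a) / p_c
    return x0 + b * (x1 - x0)


def asymptotic_variance(x0, x1, P_a, P_c, N):
    """[P_a(1-P_a) - 2 beta P_a P_c + beta^2 P_c(1-P_c)] / (N Y_c^2)."""
    beta = (0.5 - P_a) / P_c
    Y_c = P_c / (x1 - x0)  # ordinate of the central rectangle
    bracket = P_a * (1 - P_a) - 2 * beta * P_a * P_c + beta**2 * P_c * (1 - P_c)
    return bracket / (N * Y_c**2)


if __name__ == "__main__":
    # Central class from 10 to 20 holding 40% of the data, with 30% below it.
    m = interpolated_median(10.0, 20.0, 0.3, 0.4)
    var = asymptotic_variance(10.0, 20.0, 0.3, 0.4, 100)
    print(m, var, sqrt(var))
```

As the class width shrinks with $P_a$ tending to one half, the bracket tends to 1/4 and $Y_c$ to the density at the median, which recovers the classical variance of a median.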


40. Some Efficient Range-Estimates of Variation. Nilan Norris, Hunter
College, New York.

The commonly used sample range (in the sense of the difference between the largest and
smallest of the variates) is one of an unlimited number of range or difference-measures
which can be used to scale parent populations. For samples drawn from a Type III universe,
the maximum-likelihood estimate of dispersion is given by A - G, where A is the
sample arithmetic mean and G is the sample geometric mean. For samples drawn from a
Type V universe, a 100% efficient estimate of absolute variation is given by G - H, where
G is the sample geometric mean and H is the sample harmonic mean. Under certain general
conditions usually fulfilled, the standard errors of both of these range-measures of absolute
dispersion may be estimated from expressions obtained by application of the Laplace-Liapounoff
theorem. The two parametric methods of estimating absolute variation as
developed in this paper are likely to be most useful when the form of the parent universe
is known, and it is either too expensive or impossible to obtain samples large enough to
permit the use of inefficient estimates. An example of such a case is the learning curve
encountered in the analysis of frequency of occurrence of aircraft accidents by hours of
flying experience of pilots in training. E. J. G. Pitman, Proc. Camb. Phil. Soc., Vol. 33
(1937), pp. 217-218, has discussed the scaling of the Type III distribution. The method
of scaling given by Pitman differs from the method of estimation developed in this paper for
the Type III universe.
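
The two difference-measures named in this abstract are simple to compute from a sample of positive observations. The Python sketch below only evaluates A - G and G - H using the standard library; it makes no claim about their efficiency or standard errors, and the function name `mean_differences` is hypothetical.

```python
from statistics import fmean, geometric_mean, harmonic_mean


def mean_differences(sample):
    """Return (A - G, G - H) for a sample of positive observations."""
    a = fmean(sample)           # arithmetic mean A
    g = geometric_mean(sample)  # geometric mean G
    h = harmonic_mean(sample)   # harmonic mean H
    return a - g, g - h


if __name__ == "__main__":
    print(mean_differences([1.0, 4.0]))
    print(mean_differences([2.0, 2.5, 3.0, 4.5, 7.0]))
```

Since the arithmetic mean is never less than the geometric mean, which is never less than the harmonic mean, both differences are non-negative and vanish only when all observations are equal.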



NEWS AND NOTICES 

Readers are invited to submit to the Secretary of the Institute news items of interest.

Personal Items 


Dr. Franz L. Alt has resigned his position with the Ballistic Research Laboratories
at Aberdeen to join the National Bureau of Standards where he is in charge
of the Computation Laboratory of the “National Applied Mathematics
Laboratory.”

Dr. Edward W. Barankin has been promoted to Assistant Professor and Research
Associate at the Statistical Laboratory, University of California, Berkeley,
California.

Dr. Stanley Clark has accepted an associate professorship of Education at
the College of Education, University of Saskatchewan, Saskatoon, Canada.

Dr. Gerald J. Cox has resigned his position as Research Chemist in the Chemical
Division of Corn Products Refining Co., Argo, Illinois to accept an appointment
as Professor of Dental Research in the School of Dentistry of the University
of Pittsburgh.

Mr. S. Lee Crump has resigned his assistant professorship at Iowa State College
to accept a position in the Atomic Energy Project, University of Rochester.

Dr. John H. Curtiss, Chief of the National Applied Mathematics Laboratories
of the National Bureau of Standards, has assumed temporary additional duties
as Acting Chief of the Institute for Numerical Analysis. The Institute for
Numerical Analysis, located on the U.C.L.A. campus, was established by the
National Bureau of Standards with the support of the Office of Naval Research
and the United States Air Force for the two-fold purpose of pursuing mathematical
research aimed at the development of numerical techniques for the full exploitation
of the newer large-scale electronic computing machines and for performing
numerical computations basic to the extension of the frontiers of science.

Mr. Walter T. Federer has resigned his position at the Statistical Laboratory
at the Iowa State College to accept a position as Professor of Biological Statistics
in the Department of Plant Breeding at Cornell University.

Dr. John Gurland, who received his Ph.D. in mathematical statistics from the
University of California in August, 1948, is now a Benjamin Pierce Instructor in
Mathematics at Harvard University.

Dr. Joseph L. Hodges, Jr. has been promoted to Instructor and Research Associate
at the Statistical Laboratory, University of California, Berkeley.

Dr. Cyril J. Hoyt has resigned his position as Research Associate with the
Department of Education at the University of Chicago to accept an appointment
as Associate Director of the Bureau of Educational Research, University of
Minnesota.

Dr. Tjalling C. Koopmans has been promoted to Professor of Economics at
the University of Chicago and also Director of Research of the Cowles Commission
for Research in Economics.

Dr. Eugene Lukacs, formerly at Our Lady of Cincinnati College, has accepted
a position as statistician at the United States Naval Ordnance Test Station at
Inyokern, California.

Mr. Frank Jones Massey, Jr., who has been in the Department of Mathematics
at the University of Maryland, has accepted an assistant professorship in the
Department of Mathematics at the University of Oregon, Eugene, Oregon.

Miss Judith Moss has resigned her position at the National Bureau of Economic
Research and is now with the Port of New York Authority as an Economic
Analyst in the Planning Bureau.

Dr. Richard Otter has accepted an assistant professorship in the Department
of Mathematics at the University of Notre Dame.

Dr. Nathan Grier Parke, III has been appointed Research Fellow of the
Massachusetts General Hospital and Associated Research Director of the Harvard
Piatric Study.

Dr. Joseph A. Pierce is now serving as Chairman of the Division of Natural
Science and Mathematics at the Texas State University for Negroes, Houston
4, Texas.

Dr. Saul B. Sells, former Assistant to the President of the A. B. Frank Co. of
San Antonio, Texas, has joined the staff of the Department of Psychology of the
Air University, School of Aviation Medicine, Randolph Field, Texas.

Dr. Otis A. Pope, who was with the Office of Foreign Agricultural Relations,
U. S. Department of Agriculture, Technical Collaboration Branch, Washington,
D. C., died September 28th, 1948.


Special Summer Session in Survey Research Techniques 

The Survey Research Center of the University of Michigan will hold its special
summer session in Survey Research Techniques from July 18 to August 13, 1949.

The following courses will be offered: Introduction to Survey Research, Survey
Research Methods, Sampling Methods in Survey Research (introductory and
advanced), Mathematics of Sampling, Statistical Methods in Survey Research,
Techniques of Scaling.

In addition the introductory courses will be given from June 20 to July 16.
This will permit students who are attending the full eight-week summer session
of the University (June 20 to August 13) to register for the introductory courses
during the first four weeks.

It is expected that this special session will attract men and women employed
in market research or other statistical work and university instructors and graduate
students with a particular interest in this area of social science research.

All courses are offered for graduate credit and students must be admitted by
the Graduate School. Inquiries should be addressed to the Survey Research
Center, University of Michigan, Ann Arbor, Michigan.



Summer Courses in Statistics at Michigan 

In addition to the special courses in Survey Research Techniques, the following
courses of special interest to students of statistics are among those offered
by the mathematics department of the University of Michigan in the Summer
Session, June 20 to August 13: Finite Differences (Fischer), Probability (Copeland),
Theory of Statistics I and II (Carver), Significance Tests (Dwyer), Computational
Methods (Dwyer), Theory of Estimation and of Significance Tests
(Craig) and Seminar (Craig).


The International Congress of Mathematicians 


No summer meeting of the Institute of Mathematical Statistics is planned for
1950 because of the meeting of the International Congress of Mathematicians
which will be held in Cambridge, Massachusetts, August 30 to September 6, 1950.
The following statement has been prepared by the organizing committee:

An International Congress of Mathematicians will be held in Cambridge, Massachusetts,
in 1950 under the auspices of the American Mathematical Society.
The Society originally planned to act as host for a Congress in September, 1940,
which was also scheduled to meet in Cambridge. At the 1936 Congress in Oslo,
Norway, the invitation for the 1940 Congress was issued by the American delegation
in the name of the American Mathematical Society. Plans for the 1940
Congress were practically completed when the outbreak of World War II in
September, 1939, made it necessary for the Society to postpone the Congress to
a more favorable date. An Emergency Committee was established to carry on in
the interim and, on recommendation of this Committee, the Council of the
Society voted to hold the Congress in 1950.


The 1950 Congress will be the third International Congress of Mathematicians
to be held on the continent of North America. The first was held at Northwestern
University in 1893 and the second at the University of Toronto in 1924.
International Congresses were held at intervals of approximately four years,
except when war intervened, until 1936. There has been no international gathering
of mathematicians since that time and it is the sincere hope of the Organizing
Committee that the gathering in 1950 will be a truly international one,
that the American mathematicians will attend in large numbers, and that all
other countries will be well represented. The Council of the American Mathematical
Society has voted unanimously to hold a Congress which will be open to
mathematicians of all national and geographical groups.

Time and Place. The dates for the Congress have been fixed as August 30-September 6,
1950. Harvard University will be the principal host institution.
A number of other institutions in metropolitan Boston will join in the entertainment
of Congress visitors by arranging special features on their campuses.





Type of Congress. In recent years mathematicians have been much impressed
by the success of the conference method for presenting recent research in fields
where vigorous advances have just been made or are in progress. In view of the
success of mathematical conferences on special topics which have been held in
Russia, France and Switzerland and, more recently, at the Princeton Bicentennial
Celebration, the 1950 Congress will include Conferences in several fields. For
the 1940 Congress, Conferences in four fields had been planned. The number of
Conferences was thus restricted lest the introduction of a promising and novel
feature result in failure through the dissipation of interest and energy.

Following the established custom, the Organizing Committee plans to have a
number of invited hour addresses by outstanding mathematicians. In addition,
sectional meetings for the presentation of contributed papers not included in
Conference programs will be held in the following fields: I, Algebra and Theory
of Numbers; II, Analysis; III, Geometry and Topology; IV, Probability and
Statistics, Actuarial Science, Economics; V, Mathematical Physics and Applied
Mathematics; VI, Logic and Philosophy; VII, History and Education.

The official languages of the 1950 Congress will be English, French, German, 
Italian, and Russian. 

Organization. The plans for the Congress are under the supervision of an
Organizing Committee which was elected by the Council of the American Mathematical
Society in February, 1948. The Chairman is Professor Garrett Birkhoff
of Harvard University and the Vice Chairman is Professor W. T. Martin of
Massachusetts Institute of Technology. Other members of the committee are:
Professors J. L. Doob, G. C. Evans, J. R. Kline, Solomon Lefschetz, Saunders
MacLane, Dean R. G. D. Richardson, Professors Oswald Veblen, J. L. Walsh,
D. V. Widder, Norbert Wiener, and R. L. Wilder.

Many of the subventions promised for the 1940 Congress are still available.
A Financial Committee under the chairmanship of Professor John von Neumann
is endeavoring to secure additional funds. Besides support from Harvard University
and Massachusetts Institute of Technology, generous subventions have
been subscribed for the Congress by the Carnegie Corporation, the Institute for
Advanced Study, the National Research Council, and the Rockefeller
Foundation.

An Editorial Committee under the chairmanship of Professor Salomon Bochner
will assume responsibility for the publication of the Proceedings of the Congress.

Professor J. R. Kline of the University of Pennsylvania has been named Secretary
of the Congress and Dr. R. P. Boas, Executive Editor of Mathematical
Reviews, has been designated Associate Secretary.

Entertainment. Harvard University has offered the use of its dormitories and
dining rooms for mathematicians and their guests for the period of the Congress.
The Organizing Committee hopes that it will be possible to furnish room and
board without charge to all mathematicians from outside continental North
America who are members of the Congress. Congress membership fees and rates
for room and board will be announced well in advance of the opening of the
Congress.




The Entertainment Committee, of which Professor L. H. Loomis of Harvard
University is Chairman, is planning many interesting features, including a reception,
garden party, symphony concert, and banquet. It is hoped that American
mathematicians will be able to assist in the entertainment by putting their
automobiles at the disposal of the Entertainment Committee for trips to be
made out of Cambridge.

Every effort will be made to facilitate the travel at reasonable cost of foreign
participants while in the United States. Previous to the Congress, opportunity
will be given them to see New York City under the guidance of some mathematicians.

Information. Detailed information will be sent in due course to individual 
members of the American Mathematical Society and to foreign mathematical 
societies and academies. Others interested in receiving information may file 
their names in the office of the Society, and such persons will receive from time 
to time information regarding the program and arrangements. 

Communications should be addressed to the American Mathematical Society, 
531 West 116th Street, New York City 27, U. S. A. 


New Members 

The following persons have been elected to membership in the Institute
(August 16, 1948 to November 30, 1948)

Alman, John E., M.A. (Claremont Colleges) Instructor in Mathematics, College of Liberal
Arts, Boston University, m Gardner Road, Brookline 40, Massachusetts.

Andrian, Jane F., M.S. (Western Reserve Univ.) Graduate student at University of California,
ISSS G Ashby Avenue, Berkeley, California.

Arbuckle, Richard A., B.S. (Baldwin Wallace College) Research-Industrial Fellow at
Purdue University, F Fh.A. SSO-S Airport Road, Lafayette, Indiana.

Barankin, Edward W., Ph.D. (Univ. of Calif.) Assistant Professor of Mathematics and
Research Associate in Statistical Laboratory, University of California, Berkeley,
California.

Blum, Julius R., Student in mathematical statistics at the University of California, !9S7
Acton Street, Berkeley B, California.

Bronfenbrenner, Mrs. Jean, M.A. (Univ. of Chicago) Research Assistant, Cowles Commission,
University of Chicago, Chicago 37, Illinois.

Burns, Loren V., B.S. (Washburn College, Topeka, Kansas) Technical Director, MFA
Milling Co., Box iSSS S.S.S., Springfield, Missouri.

Clement, Edwin G., M.B.A. (Univ. of Chicago) Captain, Chief of Management Control,
Andrews Air Force Base, Washington.

Cramer, George F., Ph.D. (Univ. of Missouri) Mathematician, U. S. Navy Department,
Washington, D. C., liB Quincy Street, Chevy Chase U, Maryland.

Degan, James W., A.B. (Univ. of Chicago) Research Assistant, Psychometric Laboratory,
University of Chicago, im East Blst Street, Chicago ST, Illinois.

Dodd, Stuart C., Ph.D. (Princeton) Research Professor of Sociology and Director of Public
Opinion Laboratory, 47BS-~4Sth Avenue, N.E., Seattle 5, Washington.

Donnelly, Tom G., M.A. (Queen's Univ.) Graduate student at the University of North
Carolina, Room m "S", Chapel Hill, North Carolina.





Edwards-Davies, Harold D., Special Lecturer, Department of Mathematics, Dalhousie
University, 67 Seymour Street, Halifax, N.S., Canada.

Ellner, Henry, Ch.E. (College of City of New York) Statistician (Physical Sciences),
I-C Oak Grove Drive, Baltimore SO, Maryland.

Feigenbaum, Armand V., M.S. (Mass. Institute of Tech.) General Electric Company,
Room 257, Building 23, Schenectady, New York.

Festinger, Leon, Ph.D. (Univ. of Iowa) Assistant Professor of Psychology, Research Center
for Group Dynamics, University of Michigan, Ann Arbor, Michigan.

Frame, James S., Ph.D. (Harvard) Professor and Head of Department of Mathematics,
Michigan State College, Lansing, Michigan.

French, Benjamin J., M.Ed. (Univ. of New Hampshire) Examiner, Educational Testing
Service, Matthews Road, Keene, New Hampshire.

Gaffey, William R., A.B. (Univ. of Calif.) Research Assistant, University of California,
SS06 Grant Street, Berkeley 4, California.

Goodman, Leo A., A.B. (Syracuse University) Research Assistant in Mathematical Statistics
and Graduate student at Princeton University, Fine Hall, Princeton University,
Princeton, New Jersey.

Hader, Robert J., Ph.D. (North Carolina State College) Instructor and Research Assistant,
Institute of Statistics, North Carolina State College, Raleigh, North Carolina.

Haley, Kenneth D., M.S. (Stanford Univ.) Assistant Professor of Mathematics, Acadia
University, Wolfville, Nova Scotia, Canada.

Kahn, Louis B., M.S. (Univ. of Wisconsin) Research Associate, University of Wisconsin,
Box 16-F, Badger, Wisconsin.

Katz, Irving, B.S. (College of City of New York) Statistician, Strategic Air Command,
S79 - S7 Place, S.E., Washington 19, D. C.

Kientzle, Mary J., Ph.D. (Univ. of Ill.) Assistant Professor of Psychology, Department of
Psychology, Washington State College, Pullman, Washington.

Koditschek, Paul, LL.D. (Univ. of Vienna) Research Associate, Scientific Research Service,
Columbia University, S19 W. ISth Street, New York 14, New York.

Levin, Howard S., S.B. (Univ. of Chicago) Electronic Engineer, Glenn L. Martin Co.,
BSS Addison Street, Chicago IS, Illinois.

Levine, George J., B.S. (Brooklyn College) Actuarial Mathematician, 5109-1st Street,
North, Arlington, Virginia.

Liverman, J. G., B.A. (Cantab) Civil Servant, Ministry of Fuel and Power, SI Ascot Court,
Grove End Road, London, N.W. 8, England.

Loeve, Michel, Ph.D. (Sorbonne, Paris) Professor and Research Associate in Statistical
Laboratory, Durant Hall, University of California, Berkeley, California.

Loo, Ching-Tsu, Ph.D. (Univ. of Chicago) Research Associate, Statistical Laboratory,
University of California, Berkeley, California.

Lubin, Ardie, B.S. (Univ. of Chicago) Statistician, Psychology Department, Maudsley
Hospital, Denmark Hill, S.E. 5, London, England.

Moses, Lincoln E., A.B. (Stanford Univ.) 7 Perry Lane, Menlo Park, California.

Moutier, Edith, Licence-ès-sciences (Univ. of Caen, France) Teaching Assistant, Statistical
Laboratory, University of California, Berkeley, California.

Osborne, Ernest L., LL.B. (LaSalle Univ.) Economic Analyst, Department of the Army,
Chancery Apartments, SISO Wisconsin Avenue, N.W., Washington 16, D. C.

Pabst, William R. Jr., Ph.D. (Columbia Univ.) Quality Control Division, Bureau of Ordnance,
Navy Department, 64^0 Quebec Street, N.W., Washington 16, D. C.

Plackett, Robin L., M.A. (Cambridge, England) Lecturer in Mathematical Statistics,
Department of Applied Mathematics, The University, Liverpool 3, England.

Proschan, Frank, M.A. (George Washington Univ.) Research Analyst, 16S7 B St., N.W.,
Washington 9, D. C.





Rau, A. Ananthapadmanabha, M.S. (Iowa State College) Statistician and Agricultural
Meteorologist, Department of Agriculture, Bangalore, Mysore State, India.

Rees, Mina, Ph.D. (Univ. of Chicago) Head, Mathematics Branch, Office of Naval Research,
B2719, T-3 Building, Washington 26, D. C.

Roberts, Spencer W. Jr., M.S. (Univ. of Michigan) Research Associate, University of
Michigan Department of Engineering Research, SOS Thompson Street, Ann Arbor,
Michigan.

Sanna, S. C., M.Sc. (Calcutta Univ.) Graduate student in mathematical statistics at
Columbia University, !JS0 John Jay Hall, Columbia University, New York 27,
New York.

Schneiderman, Marvin A., B.S. (College of City of New York) Statistician, Biological, National
Institute of Health, T-6, 2216, Bethesda, Maryland.

Schull, William, Ph.D. (Ohio State Univ.) Student at Ohio State University, Department
of Zoology, Ohio State University, Columbus 10, Ohio.

Schweid, Samuel, B.S.S. (College of City of New York) Statistician, Industry Division,
Bureau of the Census, lilO Monroe Street, N.W., Washington 10, D. C.

Wallace, David L., B.S. (Carnegie Institute of Tech.) Graduate Student and Teaching
Assistant in Mathematics, Carnegie Institute of Technology, ISS Lawrence Avenue,
Homestead Park, Pennsylvania.

Williams, Evan James, B.C. (Univ. of Tasmania) Research Officer, Section of Mathematical
Statistics, Division of Forest Products, C.S.I.R., P. O. Box 18, South Melbourne,
S.C. 4, Australia.

Zavrotsky, Andres, Head of the Statistical Department of the Venezuela Office for Social
Insurance, Mercedes a Luneia S3, Caracas.

Correction of New Members in June, 1948 Issue:

Loizelier, Enrique Blanco, should be written as follows:

Blanco Loizelier, Enrique. (Ph.D.) Professor of Statistics, Economics Faculty, Madrid
University, Spain, Nenion No. Madrid, Spain.


ELECTION OF OFFICERS AND COUNCIL AND REVISION OF BY-LAWS 

At the membership meeting held at Cleveland on December 28, the following 
officers and members of the Council were elected: 

President:         J. Neyman 
President-Elect:   J. L. Doob 
Council: 
   3-year term:    W. G. Cochran, C. Eisenhart, H. Hotelling, A. Wald 
   2-year term:    W. Feller, P. G. Hoel, H. Scheffé, J. Wolfowitz 
   1-year term:    Gertrude Cox, M. A. Girshick, J. W. Tukey, J. von Neumann 





The By-Laws were also revised and further action was taken. More detailed 
accounts of this meeting will be sent directly to the members. 

Paul S. Dwyer 
Secretary 


REPORT ON THE SEATTLE MEETING OF THE INSTITUTE 

The thirty-sixth meeting and fourth Regional West Coast meeting of the 
Institute of Mathematical Statistics was held in Seattle, Washington, November 
26-27, 1948. The sessions of November 27, 1948 were held jointly with the 
Biometric Society (Western N. A. Region). The meeting was attended by 91 
persons, including the following 22 members of the Institute: 

F. C. Andrews, E. W. Barankin, Z. W. Birnbaum, A. H. Bowker, D. G. Chapman, R. C. 
Davis, W. J. Dixon, E. Fay, M. A. Girshick, P. Horst, H. M. Hughes, J. C. R. Li, F. Massey, 
J. Neyman, E. Paulson, Elizabeth L. Scott, Esther Seiden, M. Sobel, Z. Szatrowski, J. R. 
Vatnsdal, J. E. Walsh and Zivia S. Wurtele. 

At the morning session on November 26, Professor R. M. Winger of the Uni¬ 
versity of Washington as chairman welcomed those attending the meetings, and 
the following program of contributed papers was presented: 

1. Estimation of the Variance of the Bivariate Normal Distribution. 

Harry M. Hughes, University of California. 

2. Derivation of a Broad Class of Consistent Estimates. 

R. C. Davis, NOTS, Inyokern, California. 

3. Locally Best Unbiased Estimates. 

Edward W. Barankin, University of California. 

4. Some Problems Related to the Distribution of a Random Number of Random Variables. 

Edward Paulson, University of Washington. 

5. Asymptotic Expansions for the Distribution of Certain Likelihood Ratio Statistics. 

Albert H. Bowker, Stanford University. 

6. On a Problem of Confounding in Symmetrical Factorial Design. 

Esther Seiden, University of California. 

7. Some Bounded Significance Level Tests of Whether the Largest Observations of a Set are 
Too Small. 

John E. Walsh, Project RAND, Douglas Aircraft Corp., Santa Monica, Calif. 

The afternoon session of November 26, under the chairmanship of Professor 
J. Neyman of the University of California at Berkeley, had the following 
program: 

1. Invited paper: 

Multiple Decision Functions. 

M. A. Girshick, Stanford University. 

Contributed papers: 

2. Determination of Optimal Test Length to Maximize the Multiple Correlation Coefficient. 

Paul Horst, University of Washington. 

3. Some Numerical Comparisons of a Non-Parametric Test with Other Tests. 

F. J. Massey, University of Oregon. 





4. On the Deviation of Extreme Values. 

W. J. Dixon, University of Oregon. 

5. The Optimum Size of Interval for Making Measurements of a Rocket’s Angular Velocity. 

Edward A. Fay, University of California. 

6. Stationary Time Series Analysis and Common Stock Price Forecasting. 

Zenon Szatrowski, University of Oregon. 

At the morning session of November 27, with Professor W. F. Thompson of the 
University of Washington as chairman, the program consisted of the following 
papers: 

1. Invited paper: 

On the Place of Statistics in Fishery Biology. 

Willis S. Rich, Stanford University and U. S. Fish and Wildlife Service. 

Contributed papers: 

2. Distribution of the Number of Schools of Fish Caught per Boat. 

J. Neyman, University of California. 

3. Some Problems in Fishery Research to which Statistical Methods are Applicable. 

Ralph Silliman, U. S. Fish and Wildlife Service, Seattle, Washington. 

4. The Application of the Hypergeometric Distribution to Problems of Estimating and 
Comparing Zoological Population Sizes. 

Douglas Chapman, University of California. 

5. Extension to Multivariate Case of Neyman's Smooth Test. 

Elizabeth L. Scott, University of California. 

6. A Mathematical Theory of Vitamin A Metabolism in Fish. 

Norman E. Cooke, Pacific Fisheries Experimental Station, Vancouver, B.C. 

The afternoon session of November 27 was held under the chairmanship of 
Professor F. W. Weymouth of Stanford University, with the following program: 

1. Invited paper: 

Statistical Problem of Enumeration of Fish Eggs in the Sea. 

Oscar E. Sette, U. S. Fish and Wildlife Service, San Francisco. 

Contributed papers: 

2. The Interactance Hypothesis. 

Stuart C. Dodd, University of Washington. 

3. The Employment of Marked Members in Estimation of Animal Populations. 

Milner E. Schaefer, Stanford University. 

4. Non-Response and Repeated Call-Backs in Opinion Polls. 

Z. W. Birnbaum, University of Washington. 

5. Statistical Problems Relating to Fisheries. 

J. L. Hart, Pacific Biological Station, Nanaimo, B. C. 

At 6:30 o’clock there was a dinner for members and guests 
at the Edmond Meany Hotel. 

Z. W. Birnbaum 


REPORT ON THE CLEVELAND MEETING OF THE INSTITUTE 

The Eleventh Annual Meeting of the Institute of Mathematical Statistics was held 
at the Statler Hotel, Cleveland, Ohio, on December 27-30, 1948. The 
meeting was held in conjunction with the Annual Meeting of the American 
Statistical Association. The following 176 members of the Institute were in 
attendance: 

P. H. Anderson, E. L. Anderson, L. W. Anderson, Max Astrachan, G. J. Auner, T. A. 
Bancroft, B. Geoffrey, Z. W. Birnbaum, Archie Blake, E. E. Blanche, C. I. Bliss, Dorothy 
S. Brady, A. E. Brandt, G. W. Brown, T. H. Brown, M. A. Brumbaugh, P. T. Bruybro, K. W. 
Burgess, I. W. Burr, J. M. Cameron, A. G. Carlton, Harry Carver, P. E. Celia, Uttam Chand, 
E. A. Chapman, Edmund Churchill, Herman Chernoff, W. G. Cochran, Jerome Cornfield, 
J. H. Cover, Gertrude M. Cox, C. C. Craig, S. L. Crump, J. H. Curtiss, D. A. Darling, W. 
L. Deemer, D. B. DeLury, W. E. Deming, Philip Desind, H. F. Dorn, C. W. Dunnett, P. S. 
Dwyer, Churchill Eisenhart, Benjamin Epstein, C. D. Ferris, Leon Festinger, C. H. Fischer, 
J. C. Flanagan, M. M. Flood, L. E. Frankel, D. A. S. Fraser, H. A. Freeman, Milton Friedman, 
H. C. Fryer, E. P. Gardner, E. S. Gardner, H. H. Germond, William Gomberg, E. L. 
Green, S. W. Greenhouse, J. Gurland, E. J. Hnder, K. W. Halbert, H. J. Hand, M. H. Hansen, 
T. E. Harris, Boyd Harshbarger, P. M. Hauser, J. F. Hofmann, Harold Hotelling, A. S. 
Householder, E. E. Houseman, Helen M. Humes, C. C. Hurd, C. M. Jaeger, E. J. Jessen, 
H. L. Jones, Irving Katz, Leo Katz, Harriet J. Kelly, O. Kempthorne, A. W. Kimball, Jr., 
A. J. King, Leslie Kish, L. A. Knowler, Lila F. Knudsen, C. F. Kossack, O. E. Lancaster, 
Marvin Lavin, S. B. Littauer, Irving Lorge, F. W. Lott, Jr., Eugene Lukacs, P. J. McCarthy, 
C. J. Maloney, John Mandel, Nathan Mantel, H. B. Mann, E. S. Marks, Margaret 
Merrell, Helen Michaels, E. B. Mode, A. M. Mood, Nathan Morrison, Dorothy J. Morrow, 
J. W. Morse, J. E. Morton, Jack Moshman, Frederick Mosteller, B. D. Mudgett, Hugo 
Muench, M. E. Neifeld, E. H. Noel, G. E. Noether, J. I. Northam, H. W. Norton, J. A. 
Norton, Jr., E. G. Olds, P. S. Olmstead, Bernard Ostle, A. E. Paull, Paul Peach, M. P. 
Peisakoff, E. W. Pike, E. J. G. Pitman, R. A. Porter, J. A. Rafferty, L. J. Reed, Olav Reiersol, 
William Reitz, F. D. Rigby, A. G. Rosander, Herman Rubin, Erik Ruist, P. J. Rulon, 
Max Sasuly, F. E. Satterthwaite, L. J. Savage, Mary Ann Savas, Marvin Schneiderman, 
Elizabeth Scott, G. R. Seth, Jack Sherman, S. S. Shrikhande, C. E. Simms, J. H. Smith, 
G. W. Snedecor, Mortimer Spiegelman, B. E. Stauber, F. F. Stephan, Joseph Steinberg, 
J. V. Sturtevant, B. J. Tepping, W. R. Thompson, J. W. Tukey, Jan Vchytil, W. R. Van 
Voorhis, D. F. Votaw, Jr., F. M. Wadley, Helen M. Walker, D. L. Wallace, W. A. Wallis, 
G. S. Watson, Lionel Weiss, Samuel Weiss, E. L. Welker, M. E. Wescott, Phillips Whidder, 
D. R. Whitney, S. S. Wilks, C. P. Winsor, Gerald Winston, M. A. Woodbury, T. D. Woolsey, 
Holbrook Working, W. J. Youden. 

The first session, a joint session with the American Statistical Association, 
was held at 2:00 P.M. on Monday, December 27, at which time a paper entitled 
Statistical Concepts in an Infinite Number of Dimensions was presented by Pro¬ 
fessor David H Blackwell of Howard University. Professor E. J. G. Pitman 
of the University of Tasmania was chairman. 

The second session of the opening day was devoted to contributed papers in 
mathematical statistics, and was held at 4:00 P.M. in conjunction with the 
American Statistical Association. Professor W. R. Van Voorhis of Fenn College 
was chairman. The following papers were presented: 

1. A Necessary Condition for a Certain Class of Characteristic Functions. Preliminary 
report. Eugene Lukacs, NOTS, Inyokern, California and Our Lady of Cincinnati 
College, Cincinnati, Ohio. 

2. Precision of Estimates from Samples Selected under Marginal Restrictions. Preliminary 
report. Clifford J. Maloney, Research and Development Department, Camp Detrick, 
Frederick, Maryland. 





3. Properties of Maximum and Quasi-Maximum Likelihood Estimates of Parameters of a 
System of Linear Stochastic Difference Equations with Serially Correlated Disturbances. 
Preliminary report. Herman Rubin, Cowles Commission, University of Chicago. 

4. The Computation of Maximum Likelihood Estimates of Parameters of a System of Linear 
Stochastic Difference Equations with Serially Correlated Disturbances. 

Herman Chernoff, Cowles Commission, University of Chicago. 

5. Test Criteria for Hypotheses of Symmetry and Definiteness of a Regression Matrix for 
Demand Functions. 

Uttam Chand, University of North Carolina. 

6. The Distribution of Extreme Values in Samples whose Members are Stochastically 
Dependent. 

Benjamin Epstein, Wayne University. 

A session on Teaching Statistical Quality Control was held on Monday evening, 
December 27, jointly with the Ohio Section of the American Society for Quality 
Control and Section on Training of Statisticians of the American Statistical 
Association. Professor Samuel S. Wilks of Princeton University presided at the 
session. The following two papers were presented: 

1. Teaching Statistical Quality Control for Town and Gown. 

Lloyd A. Knowler, State University of Iowa. 

2. Instructional Aids for Statistical Quality Control. 

Edwin G. Olds, Carnegie Institute of Technology. 

The session concluded with discussion by Professor Irving W. Burr of Purdue 
University, and Professor Theodore H. Brown of Harvard University. 

A session on Review of Statistical Methodology was held jointly with the American 
Statistical Association at 2:00 P.M., December 28. Professor Frederick 
Mosteller of Harvard University presided. The following papers were presented: 

1. Surveys and Sampling. 

Philip J. McCarthy, Cornell University. 

2. Industrial Applications. 

Paul S. Olmstead, Bell Telephone Laboratories. 

3. Biology, Physical Sciences and Experimental Design. 

W. J. Youden, National Bureau of Standards. 

At 4:00 P.M. on Tuesday, December 28, Professor H. C. Fryer of Kansas 
State College presided at a joint session with the Biometric Society and Biometrics 
Section of the American Statistical Association. Papers presented were: 

1. Evaluation of Field Insecticides from Count of Survivors. 

C. I. Bliss and Neely Turner, Connecticut Agricultural Experiment Station. 

2. Curved Dosage-Response Curves. 

Oscar Kempthorne, Iowa State College. 

3. Statistical Variations in Contents of Dry-Filled Ampuls in Current Pharmaceutical 
Practice. 

M. W. Green, American Pharmaceutical Association, and Lila F. Knudsen, Food and 
Drug Administration. 

4. A Practical Method for Determining the Mean and Standard Deviation of Truncated 
Normal Distributions. 

J. Ipsen, Yale University. 





The session was concluded with discussion by D. B. DeLury, Ontario Research 
Foundation; Lloyd Miller, Sterling-Winthrop Research Institute; C. Eisenhart, 
National Bureau of Standards; J. L. Northam, Kansas State College. 

On Wednesday, December 29, at 2:00 P.M., Dr. W. Edwards Deming presided 
at a session on Effects of Error in the Independent Variate in Regression Problems. 
This meeting was held in conjunction with the Biometric Society and Biometric 
Section of the American Statistical Association. Papers presented were: 

1. Are There Two Regressions? 

Joseph Berkson, Mayo Clinic. 

2. Present Status of the Theory. 

Jerzy Neyman, University of California. 

3. The Identifiability of a Linear Relationship Between Variables which are Subject to 
Error. 

Olav Reiersol, Purdue University. 

These papers were followed by discussion by Professor Churchill Eisenhart, Na¬ 
tional Bureau of Standards, Elizabeth L. Scott, University of California, and 
C. P. Winsor, Johns Hopkins University. 

Professor Boyd Harshbarger, of the Virginia Polytechnic Institute, presided 
at the Wednesday afternoon session on contributed papers in mathematical sta¬ 
tistics. Papers presented were: 

1. On Age-Dependent Stochastic Branching Processes. 

Richard Bellman and Theodore E. Harris, Stanford University, Palo Alto, California 
and the Rand Corporation, Santa Monica, California. 

2. Cuboidal Lattices. 

G. S. Watson, Institute of Statistics, University of North Carolina. 

3. Transformations Induced by Series Approximation of Prior Probability Amplitude. 

Archie Blake, Office of the Surgeon General, U. S. Army. 

4. On the Utilization of Marked Specimens in Estimating Populations of Flying Insects. 

Cecil C. Craig, University of Michigan. 

5. On a Probability Distribution. 

Max A. Woodbury, University of Michigan. 

6. Distribution-Free Tests of Data from Factorial Experiments. 

G. W. Brown and A. M. Mood, Iowa State College. 

7. On Sums of Symmetrically Truncated Normal Random Variables. 

Fred C. Andrews and Z. W. Birnbaum, University of Washington. 

8. On the Foundation of Statistics. 

(By title). Max A. Woodbury, University of Michigan. 

9. Finitely Additive Probability Functions. 

(By title). Max A. Woodbury, University of Michigan. 

10. On Inverting a Matrix via the Gram-Schmidt Orthogonalization Process. 

(By title). Max A. Woodbury, University of Michigan. 

11. Certain Properties of the Multiparameter Unbiased Estimates. Preliminary report. 

(By title). Gobind R. Seth, Iowa State College. 

12. A Class of Lower Bounds for the Variance of Point Estimates. 

(By title). Douglas Chapman, University of California. 

13. Standard Errors and Tests of Significance for Interpolated Medians. 

(By title). Churchill Eisenhart and Miriam L. Yovick, National Bureau of Standards. 





A symposium on Randomness and its Testing occupied the 4:00 P.M. session 
on Wednesday. Dr. Walter A. Shewhart of the Bell Telephone Laboratories 
presided and the following papers were presented: 

1. Survey of Available Tests for Randomness. 

W. Allen Wallis, University of Chicago. 

2. Power Functions of Tests for Randomness. 

H. B. Mann, Ohio State University. 

3. Power Functions of Non-Parametric Tests. 

Ransom Whitney, Ohio State University. 

Discussion was led by Bernice Brown, The Rand Corporation; Paul S. Olmstead, 
Bell Telephone Laboratories; E. J. G. Pitman, University of Tasmania. 

The morning session on Thursday, December 30, was a joint session with the 
American Statistical Association, with Professor Jerzy Neyman of the University 
of California presiding. The following two papers were presented upon invitation 
of the Institute: 

1. Estimating Linear Restrictions on Regression Coefficients for Multivariate Normal 
Distributions. 

T. W. Anderson, Columbia University. 

2. Some Aspects of the Theory of Testing Composite Hypotheses. 

E. L. Lehmann, University of California. 

The Business Meeting was held at 10:00 A.M. on Tuesday, December 28. 
Dr. Churchill Eisenhart presided. A report of this meeting is found elsewhere 
in this issue. 

W. R. Van Voorhis 
Assistant Secretary 


REPORT OF THE PRESIDENT OF THE INSTITUTE FOR 1948 

The last few years have seen a considerable growth of the Institute. The 
upward trend has continued throughout 1948. The Institute has acquired 126 
new members during the year, but this gain is to be balanced against losses due 
to resignation and suspension for non-payment of dues. The Institute starts 
1949 with a membership of about 1,100 as against the membership of 
1,037 at the beginning of 1948. While the net gain is still substantial, it is not 
quite as much as hoped for, and this may serve as an incentive for an increased 
membership drive in 1949. The constantly increasing interest and research 
activities in statistical theory and methodology are well reflected in our meetings 
and the publications appearing in the Annals. 

Meetings. The growth of the Institute in the past few years has brought 
about a considerable increase in its various activities. This manifested itself 
in the programs of the meetings held during the 
year 1948. In addition to the usual invited addresses and contributed papers, 
the programs included a considerable number of symposia on various important 
subjects such as the theory of games (Berkeley, June; Madison, September), 
stochastic difference equations (Madison, September), scales of measurement 
(New York, April), sampling for industrial use (Berkeley, June), etc. The 
eleventh summer meeting was held in conjunction with the meetings of the 
American Mathematical Society and the Econometric Society (Madison, September). 
The eleventh annual meeting (Cleveland, December) was held in conjunction 
with the American Statistical Association, Econometric Society and 
Biometric Society. There were also three regional meetings: New York (April), 
Berkeley (June) and Seattle (November). The Berkeley meeting was held 
in conjunction with the Pacific Division of the American Association for the 
Advancement of Science and some of the sessions of the Seattle meeting were 
sponsored jointly with the Biometric Society. 

To facilitate the organization of meetings and arrangements of programs, 
instead of a single program committee there were three program committees 
appointed, one for Eastern, one for Mid-Western and one for Far-Western meetings. 
These committees consisted of the following members. Eastern Committee: 
W. G. Cochran, C. Eisenhart (Chairman), F. Mosteller, and J. Wolfowitz; 
Mid-Western Committee: C. C. Craig, H. B. Mann, and A. M. Mood 
(Chairman); Far Western Committee: Z. W. Birnbaum, M. A. Girshick, P. G. 
Hoel, and J. Neyman (Chairman). To coordinate the work of these three program 
committees, a coordinating committee was appointed consisting of J. W. 
Tukey (Chairman) and the three chairmen of the three program committees. 
This committee was also charged with the responsibility of making recommendations 
to the Board of Directors as to times and places for future meetings. 
Another innovation introduced during the past year was the appointment of 
assistant secretaries in connection with the meetings. S. B. Littauer acted as 
assistant secretary for the New York meeting, K. J. Arnold for the summer 
meeting in Madison, Z. W. Birnbaum for the Seattle meeting and W. R. Van 
Voorhis for the Cleveland meeting. The assistant secretaries were charged with 
the task of looking after the local arrangements that had to be made in connection 
with the meetings. The appointment of assistant secretaries proved to 
be a great success not only in facilitating the necessary local arrangements for 
meetings but also in relieving the burden on the secretary’s office. On the basis 
of this year’s experience, it seems very desirable to continue with this practice 
in the future. 

No Rietz Memorial lecture was given in 1948 in accordance with a decision 
of the Board of Directors that these lectures should not be given every year. 
It is planned, however, to have a Rietz lecture for 1949 and the Board of Directors 
invited J. Neyman to deliver it. 

The New Constitution. One of the major events of the year was the adoption 
of the new constitution at the meeting in Madison. The growth of the Institute 
in recent years made parts of the old constitution obsolete and the need for a revision 
was apparent. Our thanks are due to the Committee on Planning and 
Development which has devoted much time and consideration to the study of 
the problem and prepared a draft of a revised constitution. M. H. Hansen was 
chairman of this Committee. Other members were: J. H. Curtiss, W. G. 
Cochran, W. Feller, J. Neyman, H. W. Norton, F. F. Stephan, J. W. Tukey, 
and W. A. Wallis. A draft of the new By-Laws was prepared by J. W. Tukey, 
who acted as a subcommittee of the Committee on Planning and Development. 

Annals. The growth of the Institute during the past few years has manifested 
itself also in a constantly increasing number of manuscripts submitted for 
publication in the Annals. While it is very gratifying to see this upward trend, 
it raises some problems of financial nature. At the rate manuscripts are coming 
in, an expansion of the publication facilities of the Institute would seem 
very desirable. Increase of the volume of the Annals would, however, mean 
increased cost and the present financial situation of the Institute could not 
allow such an additional burden unless some new sources of income can be found. 
Apart from a possible increase in the cost of printing the Annals, it seems that 
additional expenditures will be necessary for secretarial help in 1949. It was 
decided at the membership meeting in Madison that additional funds be raised 
through the contributions of universities and other organizations with strong 
interest in mathematical statistics and through the contributions of the members. 
Appeals for such contributions were sent out and it is hoped that there will be a 
generous response. 

The new constitution permits the appointment of responsible Associate Editors. 
This brings up the whole question of editorial set-up and policies. A 
committee with S. S. Wilks as chairman was appointed to make a thorough study 
of the Institute’s publication experience and to make recommendations as to 
publication policies and editorial set-up. Other members of this committee are: 
W. G. Cochran, W. Feller, M. A. Girshick, J. Neyman, P. S. Olmstead, W. A. 
Wallis and J. Wolfowitz. The committee gave much thought and consideration 
to the problems involved and will report to the newly elected officers and 
Council. 

The Annals has developed under the leadership of the Editor, S. S. Wilks, 
to one of the outstanding professional journals. I am sure that I can speak for 
all our members in expressing the Institute’s indebtedness to S. S. Wilks for his 
untiring and most successful work. 


Committees. The problem of classification of statisticians in the Government 
service is naturally of considerable importance to the statistical profession. A 
committee consisting of W. E. Deming (chairman) and C. Eisenhart was appointed 
to make a thorough study of this question with a view to advising the 
Civil Service Commission. The committee prepared a report in which three 
main categories of statisticians in Government Service are distinguished: mathematical 
statisticians, statistical analysts and data-collecting statisticians. The 
report was transmitted to the Civil Service Commission with the approval of the 
Board of Directors. The members of this committee are to be commended for 
their work under the limitation of time allotted 
by the Civil Service Commission. The work on the problem of classification 
of statisticians still goes on and a committee of experts consisting of members 
of the Washington Statistical Society, the Institute of Mathematical Statistics, 
and the American Statistical Association has been set up to advise the 
Civil Service Commission on this problem. Our representatives on this committee 
of experts are: W. E. Deming, C. Eisenhart, M. H. Hansen and S. Weiss. 

The advances in numerical computations in recent years have made an enlargement 
and reorganization of the Committee on Tabulation necessary. Its present 
members are: R. L. Anderson, C. Eisenhart (Chairman), A. M. Mood, F. 
Mosteller, H. G. Romig, L. E. Simon, and J. W. Tukey. The objectives of this 
committee, as outlined by the chairman, are: (1) to prepare a comprehensive 
list of new mathematical tables that would be of value in statistical theory and 
applications, (2) to assemble an American Collection of “Tables for Statisticians”, 
(3) to prepare a list of mathematical tables of importance in statistical 
theory and applications to be recommended for inclusion in the proposed National 
Bureau of Standards volume of “Tables for the Occasional Computer”. 
To implement the program of the committee, the following sub-committees have 
been constituted: (1) “On Computing Centers” with L. E. Simon as Chairman, 
(2) “On Ranks and Runs” with A. M. Mood as Chairman, (3) “On Serial Correlations” 
with R. L. Anderson as Chairman, (4) “On 2 x 2 Tables” with C. 
Eisenhart as Chairman, (5) “On Order Statistics” with F. Mosteller as Chairman, 
(6) “On Binomial, Poisson, and Hypergeometric Distributions” with 
H. G. Romig as Chairman, (7) “On Miscellaneous Tables” with J. W. Tukey 
as Chairman. 

On the recommendation of the membership committee, consisting of H. 
Scheffé (chairman), C. C. Craig, P. G. Hoel and F. F. Stephan, the following 
members have been elected as Fellows: J. Berkson, E. L. Lehmann, E. J. G. 
Pitman, H. E. Robbins and C. M. Stein. The members of the finance committee 
for 1948 were P. S. Dwyer (chairman), C. F. Roos, L. A. Knowler and 
T. N. E. Greville. 

The Nominating Committee for 1948 consisted of W. Bartky (chairman), 
C. C. Craig, J. F. Daly, H. A. Freeman, E. L. Lehmann and W. G. Madow. The 
committee nominated J. Neyman for President, J. L. Doob for President-Elect 
and 24 Council members for the 12 positions to be filled. In accordance with 
the provisions of the new constitution, the Nominating Committee for 1949 has 
also been appointed. The members of this Committee are: W. G. Cochran 
(Chairman), M. H. Hansen, H. B. Mann, A. M. Mood and H. G. Romig. 

The Board of Directors has been exploring the possibilities for a closer cooperation 
with our colleagues abroad and for making foreign statistical publications 
more easily accessible to our members. In particular, there has been 
correspondence with Professor E. S. Pearson, Managing Editor of Biometrika, 
on the question of a possible reduction of the subscription rate of Biometrika 
for our members. As a result of these discussions, Professor Pearson offered 
certain reductions, provided that a sufficient number of subscribers can be secured. 
Detailed information on this was contained in a memorandum of the 
Secretary, P. S. Dwyer, in the November mailing to the membership. It is 
hoped that many of our members will make use of this opportunity. 

With the new constitutions of the American Statistical Association and the 
Institute of Mathematical Statistics adopted, the way is cleared for the consideration 
of possible federation plans of the various statistical organizations by the 
Inter-Society Committee on Federation. J. H. Curtiss and P. S. Olmstead continued 
to serve as our representatives on the aforementioned committee during 
1948. W. Feller was our representative on the Policy Committee for Mathematics, 
and F. C. Mosteller and S. S. Wilks represented the Institute on the 
Joint Committee for the Development of Statistical Application in Engineering 
and Manufacturing. W. Bartky was reappointed for a three-year term as our 
representative to the Division of the Physical Sciences of the National Research 
Council, and H. Hotelling was our representative to the American Association 
for the Advancement of Science. 

In conclusion, I wish to thank all committee members and others who participated 
in the work of the Institute during the past year. The heaviest burden 
falls, of course, on the Secretary and it is hard to express adequately our appreciation 
for his unselfish efforts and devotion. The smooth and efficient conduct 
of the affairs of the Institute is largely due to his work. 

Abraham Wald 
President, 1948 

December 31, 1948 

REPORT OF THE SECRETARY-TREASURER OF THE INSTITUTE 

FOR 1948¹ 

At the beginning of 1948 the Institute had 1037 members and during the 
period covered by this report 126 new members (13 of whom begin their membership 
with 1949) joined the Institute and two members were re-instated. 
During 1948 the Institute lost 64 members, of which 24 were by resignation, 38 
by suspension for non-payment of dues and 2 by death. Judging from the 
information available at this date, the Institute will have 1101 members as it 
starts 1949. 
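The membership figures above can be reconciled with a few lines of arithmetic. The sketch below is illustrative only: the variable names are assumptions, and the count of new members is taken as 126, the figure given in the President's report, since that is the value under which the stated totals agree.

```python
# Membership reconciliation for 1948, using the counts quoted in the reports.
# All names here are illustrative; only the numbers come from the text.
start_members = 1037   # members at the beginning of 1948
new_members = 126      # new members (figure from the President's report)
reinstated = 2         # members re-instated during the year

# Losses during 1948, broken down by cause.
losses = {"resignation": 24, "suspension": 38, "death": 2}
total_losses = sum(losses.values())

# Expected membership at the start of 1949.
end_members = start_members + new_members + reinstated - total_losses

print(total_losses, end_members)
```

Under these assumptions the losses sum to the 64 reported and the closing membership matches the 1101 stated by the Secretary.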

Deceased during the year were Dr. Otis A. Pope and H. M. Tompkins. 

Meetings of the Institute held during 1948 included those at Columbia University 
on April 14-15, at the Berkeley campus of the University of California 
on June 22-24, at the University of Wisconsin on September ti-IO, at the University 
of Washington on November 26-27, and at Cleveland on December 
26-30. The Secretary wishes to call attention to the excellent work of the 
members who served as assistant secretaries at these meetings: Professor 
Littauer at New York, Professor Arnold at Madison, Professor Birnbaum at 
Seattle and Professor Van Voorhis at Cleveland. 

¹ This report covers the period January 1, 1948 to December 20, 1948, as the books were 
closed on December 20, 1948 so that the report could be made at the annual meeting. 





A summary of the financial transactions of the Institute is given in 
the Financial Statement for 1948 which follows: 

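FINANCIAL STATEMENT 
December 31, 1947 to December 20, 1948 

A. RECEIPTS 

Balance on Hand,** December 31, 1947 ...........................   $5,858.37 
Dues ...........................................................    7,482.21 
Contributions ..................................................      255.50 
Subscriptions ..................................................    3,660.40 
Sale of Back Numbers ...........................................    2,718.27 
Income from Investments ........................................      100.00 
Advertising ....................................................      160.00 
Miscellaneous ..................................................       57.24 

Total ..........................................................  $20,291.99 

B. EXPENDITURES 

Annals: Current 
   Office of the Editor ............................    $175.00 
   Waverly Press ...................................   7,824.66    $7,999.66 

Annals: Back Numbers 
   Reprinted Vol. XI #2 & #3; XII #2 & #3; XIV #4 ..............    1,968.50 

Mathematical Reviews and Inter-Society Committee ...............      225.00 

Office of the Secretary-Treasurer 
   Printing, memoranda, etc. (including some stamped 
      envelopes) ...................................   1,174.62 
   Postage, supplies, express, telephone calls .....     226.00 
   Clerical help ...................................   1,468.00 
   Travelling Expense ..............................      30.48     2,898.00 

Miscellaneous ..................................................       79.82 

Balance on Hand,** December 20, 1948 ...........................    7,121.01 

Total ..........................................................  $20,291.99 

C. SUMMARY OF RECEIPTS AND EXPENDITURES 

Balance on Hand,** December 31, 1947 ...........................   $5,858.37 
Receipts during 1948 ...........................................   14,433.62 
Expenditures during 1948 .......................................   13,170.98 
Balance on Hand,** December 20, 1948 ...........................    7,121.01 

** In bank deposits and government bonds. 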
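The summary in Section C can be checked against the itemized receipts and the expenditure subtotals. The following is a minimal sketch, not part of the report: the dictionary keys and variable names are illustrative, and the opening balance is taken as $5,858.37, the reading consistent with the stated grand total of $20,291.99.

```python
from decimal import Decimal

# Opening balance, December 31, 1947 (assumed reading: $5,858.37).
opening_balance = Decimal("5858.37")

# Receipts during 1948, as itemized in Section A.
receipts = {
    "Dues": Decimal("7482.21"),
    "Contributions": Decimal("255.50"),
    "Subscriptions": Decimal("3660.40"),
    "Sale of Back Numbers": Decimal("2718.27"),
    "Income from Investments": Decimal("100.00"),
    "Advertising": Decimal("160.00"),
    "Miscellaneous": Decimal("57.24"),
}

# Expenditures during 1948, using the subtotals stated in Section B.
expenditures = {
    "Annals, current": Decimal("7999.66"),
    "Annals, back numbers": Decimal("1968.50"),
    "Mathematical Reviews and Inter-Society Committee": Decimal("225.00"),
    "Office of the Secretary-Treasurer": Decimal("2898.00"),
    "Miscellaneous": Decimal("79.82"),
}

total_receipts = sum(receipts.values(), Decimal("0"))
total_expenditures = sum(expenditures.values(), Decimal("0"))

# Closing balance = opening balance + receipts - expenditures.
closing_balance = opening_balance + total_receipts - total_expenditures
grand_total = opening_balance + total_receipts

print(total_receipts, total_expenditures, closing_balance, grand_total)
```

Decimal arithmetic is used here so that cents are added exactly rather than as binary floating-point values.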

D. lim MEMBERSHIP FUNDS 

It has been the practice to place all life membership payments in a special fund (most of 
which is in government bonds) and to hold all these funds in reserve until the death of the 
member—nfter which his payment is released to the general fund. There wore no new life 




membership payments in 1948. During the year a transfer to the general fund has been 
made of the life membership payment of Professor Irving Fisher, who died in 1947. 

                                    December 31, 1947  December 20, 1948

Number of Life Members ...........                 30                 29
U. S. Government Bonds ...........          $1,888.00          $1,888.00
Bank Deposits ....................             428.00             392.00

Total ............................          $2,316.00          $2,280.00

E. BACK ISSUES FUND

It has been our policy, since January 1, 1948, to use income from the sale of back issues
to finance the additional reprinting of back issues.

Income from the Sale of Back Issues during 1948 ......   $2,718.27
Expense for Reprinting Back Issues in 1948 ...........    1,968.50

Balance in the Fund, December 20, 1948 ...............     $749.77


At present 500 copies of Volume 13 #1 and #2 are being reprinted at a cost of $736.00.
The payment of this in January will leave a small balance in the fund. 


F. COMPARISON OF ASSETS ON DECEMBER 31, 1947 AND DECEMBER 20, 1948

                                                      1947          1948

U. S. Government G Bonds ...................     $3,000.00     $3,000.00
Life Membership Funds ......................      2,316.00      2,280.00
Back Issues Fund ...........................             -        749.77
Additional Bank Deposits ...................        643.47      1,091.24
Current Accounts Receivable ................        423.65        201.22
Estimated Value (Cost) of Back Annals' .....     10,806.73     12,785.61

Total ......................................    $17,148.06    $20,107.84
Net Gain 1948 ..............................                    3,049.19


G. LIABILITIES OF INSTITUTE OF MATHEMATICAL STATISTICS AS OF DECEMBER 20, 1948

All bills which have been presented have been paid. The Life Membership Fund now
contains $2,280.00 which covers 29 members. Also, $4,000.60 has been paid in for
contributions and 1949 dues and subscriptions.


This report does not cover the amount of $13.95 which is held by the Institute
for the fund for Annals for Countries Devastated by the War. (This fund has
been under the supervision of Professor Neyman.) During the year this fund
purchased $376.25 in back issues (at the agreed rate of $4.50 per volume) which
has contributed to the total sales in back issues.

There has been little change in the life membership fund during the year.
Our practice of making no transfer of life membership funds until the death
of the member is most conservative and protects the interests of the life member.

The evaluation of our inventory is always difficult. We now have
19,083 issues of the Annals. At 67 cents per copy, it appears that $12,785.61 is a
fair estimate of their actual cost. This is in fact less than 5 times the actual

' Cost of Annals calculated at 67 cents per copy.


















income from back issues this year and hence seems to be a very conservative
estimate of the marketable (within ten years) value of our present inventory.
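The inventory estimate and its comparison with this year's income from back issues can be reproduced as follows. This is a sketch only; the variable names are illustrative, and the figures are those quoted in the report (19,083 issues, 67 cents per copy, $2,718.27 income from the sale of back issues).

```python
from decimal import Decimal

issues_on_hand = 19083                        # issues of the Annals in stock
cost_per_copy = Decimal("0.67")               # estimated cost per copy, in dollars
back_issue_income_1948 = Decimal("2718.27")   # income from sale of back issues, 1948

# Estimated value (cost) of the inventory.
inventory_value = issues_on_hand * cost_per_copy

# Five times this year's income from back issues, for comparison.
five_times_income = 5 * back_issue_income_1948

print(inventory_value, five_times_income)
```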

We are in a position now to continue to supply all issues beginning with volume
7 and expect that the sales in back volumes will be such that within two or three
years we will be able to reprint the 9 issues in volumes 1-6 which are now
practically or completely exhausted.

It appears that the increase in dues and subscriptions has been adequate to 
take care of the increased expense during 1948. No bonds have been cashed 
during the year. Additional funds appear necessary for 1949, however, since 
the present amount of clerical help in the office of the Secretary-Treasurer is 
utterly inadequate. The employment of additional secretarial assistance, which 
the Institute must have, will increase the total expense of this office by about 
$1,200.00. It is necessary, too, to provide a cushion for a possible increase in
our Waverly bill, which is up about 10% in 1948. It appears that we may 
need from $1,500.00 to $2,000.00 additional funds for 1949. Available sources 
are increases in the number of members and subscribers, contributions from
our members, and institutional contributions and memberships. 

Paul S. Dwyer
Secretary-Treasurer

December 21, 1948 


REPORT OF THE EDITOR FOR 1948 

During 1948 the rate of submission of manuscripts for publication in the
Annals has continued to increase. The size of the Annals was held approximately
to that set for 1947, the number of pages printed in 1948 being 610.
The 1948 volume of the Annals contained 59 papers, of which 24 were short
notes.

During the past year the backlog of papers has increased to nearly two issues. 
Thus manuscripts submitted now, especially the longer ones, must wait at least 
six months after being refereed in order to be printed. If the rate at which 
manuscripts are submitted increases, as it has during the last two years, this 
waiting gap may increase to a year by the end of 1949. 

If additional funds could be found, it would be highly desirable to increase the 
Annals to 700 pages in 1949. 

The manuscripts being received continue to cover a rather wide range of 
topics in probability and statistics. Almost all of them are research papers. 
In the Editor’s opinion it would be highly desirable for the Institute to take steps, 
perhaps through invited addresses, to secure good expository and review articles. 
Sustained attempts have been made over a period of years to obtain such articles 
by invitation, but with little success. 

The Editor wishes to take this opportunity to acknowledge, on behalf of the
Editorial Committee, the generous refereeing assistance which has been given by
the following persons: Z. W. Birnbaum, A. H. Bowker, I. W. Burr, G. W. Brown,
K. L. Chung, W. J. Dixon, A. Dvoretzky, T. N. E. Greville, F. E. Grubbs,
M. H. Hansen, T. E. Harris, G. Hastings, H. B. Horton, G. A. Hunt, B. F.
Kimball, T. Koopmans, H. Levene, M. S. MacPhail, P. J. McCarthy, R. B.
Murphy, M. P. Peisakoff, P. S. Olmstead, E. Paulson, H. G. Romig, L. J. Savage,
F. F. Stephan, D. F. Votaw and J. E. Walsh.

The Editor owes special acknowledgment to Mr. M. E. Freeman for preparation
of manuscripts and to Mrs. Frances M. Purvis for other editorial and office
assistance.

S. S. Wilks
Editor


December 31, 1948. 



STATISTICAL DECISION FUNCTIONS 

By Abraham Wald¹

Columbia University 

Introduction and summary. The foundations of a general theory of statistical
decision functions, including the classical non-sequential case as well as the
sequential case, were discussed by the author in a previous publication
[3]. Several assumptions made in [3] appear, however, to be unnecessarily
restrictive (see conditions 1-7, pp. 297-8 in [3]). These assumptions, moreover,
are not always fulfilled for statistical problems in their conventional form. In 
this paper the main results of [3], as well as several new results, are obtained 
from a considerably weaker set of conditions which are fulfilled for most of the 
statistical problems treated in the literature. It seemed necessary to abandon 
most of the methods of proofs used in [3] (particularly those in section 4 of [3]) 
and to develop the theory from the beginning. To make the present paper self- 
contained, the basic definitions already given in [3] are briefly restated in 
section 2.1. 

In [3] it is postulated (see Condition 3, p. 297) that the space $\Omega$ of all admissible
distribution functions $F$ is compact. In problems where the distribution function
$F$ is known except for the values of a finite number of parameters, i.e., where
$\Omega$ is a parametric class of distribution functions, the compactness condition will
usually not be fulfilled if no restrictions are imposed on the possible values of the
parameters. For example, if $\Omega$ is the class of all univariate normal distributions
with unit variance, $\Omega$ is not compact. It is true that by restricting the parameter
space to a bounded and closed subset of the unrestricted space, compactness of
$\Omega$ will usually be attained. Since such a restriction of the parameter space can
frequently be made in applied problems, the condition of compactness may not
be too restrictive from the point of view of practical applications. Nevertheless,
it seems highly desirable from the theoretical point of view to eliminate or to
weaken the condition of compactness of $\Omega$. This is done in the present paper.
The compactness condition is completely omitted in the discrete case (Theorems
2.1-2.5), and replaced by the condition of separability of $\Omega$ in the continuous
case (Theorems 3.1-3.4). The latter condition is fulfilled in most of the
conventional statistical problems.

Another restriction postulated in [3] (Condition 4, p. 297) is the continuity
of the weight function $W(F, d)$ in $F$. As explained in section 2.1 of the present
paper, the value of $W(F, d)$ is interpreted as the loss suffered when $F$ happens to
be the true distribution of the chance variables under consideration and the
decision $d$ is made by the statistician. While the assumption of continuity of
$W(F, d)$ in $F$ may seem reasonable from the point of view of practical application,
it is rather undesirable from the theoretical point of view for the following

¹ Work done under the sponsorship of the Office of Naval Research.



reasons. It is of considerable theoretical interest to consider simplified weight
functions $W(F, d)$ which can take only the values 0 and 1 (the value 0 corresponds
to a correct decision, and the value 1 to a wrong decision). Frequently, such
weight functions are necessarily discontinuous. Consider, for example, the
problem of testing the hypothesis $H$ that the mean $\theta$ of a normally distributed
chance variable $X$ with unit variance is equal to zero. Let $d_1$ denote the decision
to accept $H$, and $d_2$ the decision to reject $H$. Assigning the value zero to the
weight $W$ whenever a correct decision is made, and the value 1 whenever a
wrong decision is made, we have:

$W(\theta, d_1) = 0$ for $\theta = 0$, and $= 1$ for $\theta \neq 0$; $W(\theta, d_2) = 0$ for $\theta \neq 0$,
and $= 1$ for $\theta = 0$.

This weight function is obviously discontinuous. In the present paper the
main results (Theorems 2.1-2.5 and Theorems 3.1-3.4) are obtained without
making any continuity assumption regarding $W(F, d)$.
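The discontinuity of the 0-1 weight function in the testing example can be seen numerically. The following is a minimal sketch; the function name `weight` and the string labels for the two decisions are illustrative assumptions, not notation from the paper.

```python
def weight(theta: float, decision: str) -> int:
    """0-1 weight for testing H: theta = 0.

    Returns 0 for a correct decision and 1 for a wrong decision.
    'accept' plays the role of d_1 and 'reject' the role of d_2.
    """
    h_is_true = (theta == 0.0)
    if decision == "accept":
        return 0 if h_is_true else 1
    if decision == "reject":
        return 1 if h_is_true else 0
    raise ValueError("decision must be 'accept' or 'reject'")


# The weight jumps from 0 to 1 for arbitrarily small departures from theta = 0,
# so it cannot be continuous in theta.
for theta in (0.0, 1e-12, 1e-6, 1.0):
    print(theta, weight(theta, "accept"), weight(theta, "reject"))
```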

The restrictions imposed in the present paper on the cost function of
experimentation are considerably weaker than those formulated in [3]. Condition 6
[3, p. 297] concerning the class of admissible distribution functions, and
condition 7 [3, p. 298] concerning the class of decision functions at the disposal of
the statistician are omitted here altogether.

One of the new results obtained here is the establishment of the existence 
of so called minimax solutions under rather weak conditions (Theorems 2.3 and 
3.2). This result is a simple consequence of two lemmas (Lemmas 2.4 and 3.3) 
which seem to be of interest in themselves. 

The present paper consists of three sections. In the first section several
theorems are given concerning zero sum two person games which go somewhat 
beyond previously published results. The results in section 1 are then applied 
to statistical decision functions in sections 2 and 3. Section 2 treats the case of 
discrete chance variables, while section 3 deals with the continuous case. The 
two cases have been treated separately, since the author was not able to find 
any simple and convenient way of combining them into a single more general 
theory. 

1. Conditions for strict determinateness of a zero sum two person game.
The normalized form of a zero sum two person game may be defined as follows
(see [1, section 14.1]): there are two players and there is a bounded and real
valued function $K(a, b)$ of two variables $a$ and $b$ given where $a$ may be any point
of a space $A$ and $b$ may be any point of a space $B$. Player 1 chooses a point
$a$ in $A$ and player 2 chooses a point $b$ in $B$, each choice being made in complete
ignorance of the other. Player 1 then gets the amount $K(a, b)$ and player 2 the
amount $-K(a, b)$. Clearly, player 1 wishes to maximize $K(a, b)$ and player 2
wishes to minimize $K(a, b)$.

Any element $a$ of $A$ will be called a pure strategy of player 1, and any element





$b$ of $B$ a pure strategy of player 2. A mixed strategy of player 1 is defined as
follows: instead of choosing a particular element $a$ of $A$, player 1 chooses a
probability measure $\xi$ defined over an additive class $\mathfrak{A}$ of subsets of $A$ and the
point $a$ is then selected by a chance mechanism constructed so that for any
element $\alpha$ of $\mathfrak{A}$ the probability that the selected element $a$ will be contained in
$\alpha$ is equal to $\xi(\alpha)$. Similarly, a mixed strategy of player 2 is given by a probability
measure $\eta$ defined over an additive class $\mathfrak{B}$ of subsets of $B$ and the element $b$
is selected by a chance mechanism so that for any element $\beta$ of $\mathfrak{B}$ the probability
that the selected element $b$ will be contained in $\beta$ is equal to $\eta(\beta)$. The expected
value of the outcome $K(a, b)$ is then given by

(1.1)    $K^*(\xi, \eta) = \int_B \int_A K(a, b)\, d\xi\, d\eta.$

We can now reinterpret the value of $K(a, b)$ as the value of $K^*(\xi_a, \eta_b)$ where $\xi_a$
and $\eta_b$ are probability measures which assign probability 1 to $a$ and $b$, respectively.
In what follows, we shall write $K(\xi, \eta)$ for $K^*(\xi, \eta)$; $K(a, b)$ will be used
synonymously with $K(\xi_a, \eta_b)$, $K(a, \eta)$ synonymously with $K(\xi_a, \eta)$ and $K(\xi, b)$
synonymously with $K(\xi, \eta_b)$. This can be done without any danger of confusion.
A game is said to be strictly determined if

(1.2)    $\sup_{\xi} \inf_{\eta} K(\xi, \eta) = \inf_{\eta} \sup_{\xi} K(\xi, \eta).$

The basic theorem proved by von Neumann [1] states that if $A$ and $B$ are
finite the game is always strictly determined, i.e., (1.2) holds. In some previous
publications (see [2] and [3]) the author has shown that (1.2) always holds if one
of the spaces $A$ and $B$ is finite or compact in the sense of some intrinsic metric,
but does not necessarily hold otherwise. A necessary and sufficient condition
for the validity of (1.2) was given in [2] for spaces $A$ and $B$ with countably many
elements. In this section we shall give sufficient conditions as well as necessary
and sufficient conditions for the validity of (1.2) for arbitrary spaces $A$ and $B$.
These results will then be used in later sections.
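For a finite game, strict determinateness in mixed strategies can be illustrated numerically. The sketch below is an assumption-laden illustration rather than part of the theory: it takes the familiar matching-pennies payoff matrix (not an example from the paper), restricts one player to two pure strategies so that a mixed strategy is a single number, and approximates the Sup and Inf by a grid search. It uses the fact that $K(\xi, \eta)$ is linear in $\eta$, so the inner Inf over mixed strategies is attained at a pure strategy (and likewise for the inner Sup). All function names are hypothetical.

```python
def pure_lower_value(K):
    """Sup Inf over pure strategies: the largest row minimum."""
    return max(min(row) for row in K)


def pure_upper_value(K):
    """Inf Sup over pure strategies: the smallest column maximum."""
    n_cols = len(K[0])
    return min(max(row[j] for row in K) for j in range(n_cols))


def mixed_lower_value(K, steps=1000):
    """Grid approximation of Sup_xi Inf_eta K(xi, eta); player 1 has two rows."""
    n_cols = len(K[0])
    best = float("-inf")
    for s in range(steps + 1):
        p = s / steps  # xi assigns probability p to row 0 and 1 - p to row 1
        worst = min(p * K[0][j] + (1 - p) * K[1][j] for j in range(n_cols))
        best = max(best, worst)
    return best


def mixed_upper_value(K, steps=1000):
    """Grid approximation of Inf_eta Sup_xi K(xi, eta); player 2 has two columns."""
    best = float("inf")
    for s in range(steps + 1):
        q = s / steps  # eta assigns probability q to column 0 and 1 - q to column 1
        worst = max(q * row[0] + (1 - q) * row[1] for row in K)
        best = min(best, worst)
    return best


matching_pennies = [[1, -1], [-1, 1]]
print(pure_lower_value(matching_pennies), pure_upper_value(matching_pennies))
print(mixed_lower_value(matching_pennies), mixed_upper_value(matching_pennies))
```

In pure strategies the two values differ, while over mixed strategies they coincide, as (1.2) requires for finite $A$ and $B$.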

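In what follows, for any subset $\alpha$ of $A$ the symbol $\xi_\alpha$ will denote a probability
measure $\xi$ in $A$ for which $\xi(\alpha) = 1$. Similarly, for any subset $\beta$ of $B$, $\eta_\beta$ will stand
for a probability measure $\eta$ in $B$ for which $\eta(\beta) = 1$. We shall now prove the
following lemma.

Lemma 1.1. Let $\{\alpha_i\}$ ($i = 1, 2, \cdots$, ad inf.) be a sequence of subsets of $A$
such that $\alpha_i \subset \alpha_{i+1}$ and let $\alpha = \sum_{i=1}^{\infty} \alpha_i$. Then

(1.3)    $\lim_{i \to \infty} \sup_{\xi_{\alpha_i}} \inf_{\eta} K(\xi_{\alpha_i}, \eta) = \sup_{\xi_\alpha} \inf_{\eta} K(\xi_\alpha, \eta).$

Proof: Clearly, the limit of $\sup_{\xi_{\alpha_i}} \inf_{\eta} K(\xi_{\alpha_i}, \eta)$ exists as $i \to \infty$ and cannot
exceed the value of the right hand member in (1.3). Put

(1.4)    $\lim_{i \to \infty} \sup_{\xi_{\alpha_i}} \inf_{\eta} K(\xi_{\alpha_i}, \eta) = \rho$
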



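and

(1.5)    $\sup_{\xi_\alpha} \inf_{\eta} K(\xi_\alpha, \eta) = \rho + \delta \quad (\delta \geq 0).$

Suppose that $\delta > 0$. Then there exists a probability measure $\xi_\alpha^0$ such that

(1.6)    $K(\xi_\alpha^0, \eta) \geq \rho + \frac{\delta}{2}$ for all $\eta$.

Let $\xi_{\alpha_i}^0$ be the probability measure defined as follows: for any subset $\alpha^*$ of $\alpha_i$
we have

(1.7)    $\xi_{\alpha_i}^0(\alpha^*) = \frac{\xi_\alpha^0(\alpha^*)}{\xi_\alpha^0(\alpha_i)}.$

Then, since $\lim_{i \to \infty} \xi_\alpha^0(\alpha - \alpha_i) = 0$, we have

(1.8)    $\lim_{i \to \infty} K(\xi_{\alpha_i}^0, \eta) = K(\xi_\alpha^0, \eta)$

uniformly in $\eta$. Hence, for sufficiently large $i$, we have

(1.9)    $\inf_{\eta} K(\xi_{\alpha_i}^0, \eta) > \rho + \frac{\delta}{3}$

which is a contradiction to (1.4). Thus, $\delta = 0$ and Lemma 1.1 is proved.
Interchanging the role of the two players, we obtain the following lemma.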

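Lemma 1.2. Let $\{\beta_i\}$ be a sequence of subsets of $B$ such that $\beta_i \subset \beta_{i+1}$ and let
$\sum_{i=1}^{\infty} \beta_i = \beta$. Then

(1.10)    $\lim_{i \to \infty} \inf_{\eta_{\beta_i}} \sup_{\xi} K(\xi, \eta_{\beta_i}) = \inf_{\eta_\beta} \sup_{\xi} K(\xi, \eta_\beta).$

We shall now prove the following lemma.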

Lemma 1.3. The inequality*

(1.11)    $\sup_{\xi} \inf_{\eta} K(\xi, \eta) \leq \inf_{\eta} \sup_{\xi} K(\xi, \eta)$

always holds.

Proof: For any given $\epsilon > 0$, it is possible to find probability measures $\xi^0$ and
$\eta^0$ such that

(1.12)    $\inf_{\eta} \sup_{\xi} K(\xi, \eta) \geq \sup_{\xi} K(\xi, \eta^0) - \epsilon$

and

(1.13)    $\sup_{\xi} \inf_{\eta} K(\xi, \eta) \leq \inf_{\eta} K(\xi^0, \eta) + \epsilon.$

* This inequality was given by v. Neumann [1] for finite spaces $A$ and $B$.





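Then we have

(1.14)    $\sup_{\xi} \inf_{\eta} K(\xi, \eta) \leq \inf_{\eta} K(\xi^0, \eta) + \epsilon \leq K(\xi^0, \eta^0) + \epsilon \leq \sup_{\xi} K(\xi, \eta^0) + \epsilon \leq \inf_{\eta} \sup_{\xi} K(\xi, \eta) + 2\epsilon.$

Since $\epsilon$ can be chosen arbitrarily small, Lemma 1.3 is proved.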

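Theorem 1.1. If $\alpha$ is a subset of $A$ such that

$\sup_{\xi_\alpha} \inf_{\eta} K(\xi_\alpha, \eta) = \inf_{\eta} \sup_{\xi_\alpha} K(\xi_\alpha, \eta)$

and

$\inf_{\eta} \sup_{\xi_\alpha} K(\xi_\alpha, \eta) = \inf_{\eta} \sup_{\xi} K(\xi, \eta),$

then

$\sup_{\xi} \inf_{\eta} K(\xi, \eta) = \inf_{\eta} \sup_{\xi} K(\xi, \eta).$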

Proof: Clearly,

(1.15)    $\sup_{\xi_\alpha} \inf_{\eta} K(\xi_\alpha, \eta) \leq \sup_{\xi} \inf_{\eta} K(\xi, \eta)$

and

(1.16)    $\inf_{\eta} \sup_{\xi_\alpha} K(\xi_\alpha, \eta) \leq \inf_{\eta} \sup_{\xi} K(\xi, \eta).$

If the left hand members of (1.15) and (1.16) are equal to each other and
equal to the right member of (1.16), then

(1.17)    $\sup_{\xi} \inf_{\eta} K(\xi, \eta) \geq \inf_{\eta} \sup_{\xi} K(\xi, \eta).$

Because of Lemma 1.3 the equality sign must hold and Theorem 1.1 is proved.
Interchanging the two players, we obtain from Theorem 1.1:

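Theorem 1.2. If $\beta$ is a subset of $B$ such that

$\sup_{\xi} \inf_{\eta_\beta} K(\xi, \eta_\beta) = \inf_{\eta_\beta} \sup_{\xi} K(\xi, \eta_\beta)$

and

$\sup_{\xi} \inf_{\eta_\beta} K(\xi, \eta_\beta) = \sup_{\xi} \inf_{\eta} K(\xi, \eta),$

then

$\sup_{\xi} \inf_{\eta} K(\xi, \eta) = \inf_{\eta} \sup_{\xi} K(\xi, \eta).$

We shall now prove the following theorem.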

Theorem 1.3. If $\{\alpha_i\}$ is a sequence of subsets of $A$ such that $\alpha_i \subset \alpha_{i+1}$ and
$\sum_{i=1}^{\infty} \alpha_i = A$, and if

(1.18)    $\sup_{\xi_{\alpha_i}} \inf_{\eta} K(\xi_{\alpha_i}, \eta) = \inf_{\eta} \sup_{\xi_{\alpha_i}} K(\xi_{\alpha_i}, \eta)$

for each $i$, then a necessary and sufficient condition for the validity of

(1.19)    $\sup_{\xi} \inf_{\eta} K(\xi, \eta) = \inf_{\eta} \sup_{\xi} K(\xi, \eta)$

is that

(1.20)    $\lim_{i \to \infty} \inf_{\eta} \sup_{\xi_{\alpha_i}} K(\xi_{\alpha_i}, \eta) = \inf_{\eta} \sup_{\xi} K(\xi, \eta).$

Proof: Because of (1.18) and Lemma 1.1 we have

(1.21)    $\lim_{i \to \infty} \inf_{\eta} \sup_{\xi_{\alpha_i}} K(\xi_{\alpha_i}, \eta) = \sup_{\xi} \inf_{\eta} K(\xi, \eta).$

Hence, (1.20) implies (1.19) and (1.19) implies (1.20). This proves Theorem 1.3.

Interchanging the role of the two players, we obtain from Theorem 1.3 the
following theorem.

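Theorem 1.4. If $\{\beta_i\}$ is a sequence of subsets of $B$ such that $\beta_i \subset \beta_{i+1}$ and
$\sum_{i=1}^{\infty} \beta_i = B$, and if

$\sup_{\xi} \inf_{\eta_{\beta_i}} K(\xi, \eta_{\beta_i}) = \inf_{\eta_{\beta_i}} \sup_{\xi} K(\xi, \eta_{\beta_i}),$

then a necessary and sufficient condition for the validity of (1.19) is that

(1.22)    $\lim_{i \to \infty} \sup_{\xi} \inf_{\eta_{\beta_i}} K(\xi, \eta_{\beta_i}) = \sup_{\xi} \inf_{\eta} K(\xi, \eta).$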

In [3] an intrinsic metric was introduced in the spaces $A$ and $B$. The distance
of two elements $a_1$ and $a_2$ of $A$ is defined by

(1.23)    $\delta(a_1, a_2) = \sup_{b} |K(a_1, b) - K(a_2, b)|.$

Similarly, the distance between two points $b_1$ and $b_2$ of $B$ is defined by

(1.24)    $\delta(b_1, b_2) = \sup_{a} |K(a, b_1) - K(a, b_2)|.$
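For finite spaces the intrinsic distances (1.23) and (1.24) reduce to maxima over the rows or columns of the payoff matrix. The following sketch assumes that $K$ is stored as a list of rows, with rows indexed by the elements of $A$ and columns by the elements of $B$; the function names and the sample matrix are illustrative only.

```python
def distance_in_A(K, i1, i2):
    """delta(a_1, a_2) = max over b of |K(a_1, b) - K(a_2, b)|, as in (1.23)."""
    n_cols = len(K[0])
    return max(abs(K[i1][j] - K[i2][j]) for j in range(n_cols))


def distance_in_B(K, j1, j2):
    """delta(b_1, b_2) = max over a of |K(a, b_1) - K(a, b_2)|, as in (1.24)."""
    return max(abs(row[j1] - row[j2]) for row in K)


# Hypothetical payoff matrix with two pure strategies for player 1
# and three pure strategies for player 2.
K = [
    [1.0, -1.0, 0.0],
    [0.5, -1.0, 2.0],
]
print(distance_in_A(K, 0, 1))  # rows differ most in the third column
print(distance_in_B(K, 0, 1))  # columns 0 and 1 differ most in the first row
```

Two pure strategies at distance zero give the same outcome against every strategy of the opponent, which is why the metric is called intrinsic.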

Suppose that there exists a sequence $\{\alpha_i\}$ of subsets of $A$ such that $\alpha_i$ is
conditionally compact, $\alpha_i \subset \alpha_{i+1}$ and $\sum_{i=1}^{\infty} \alpha_i = A$. It was shown in [3] that for
any conditionally compact subset $\alpha$, the relation (1.18) holds. Hence, according
to Theorem 1.3, a necessary and sufficient condition for the validity of (1.19)
is that (1.20) holds for a sequence $\{\alpha_i\}$ where $\alpha_i$ is conditionally compact,
$\alpha_i \subset \alpha_{i+1}$ and $\sum_{i=1}^{\infty} \alpha_i = A$. Similar remarks can be made concerning the space $B$.

The distance definitions given in (1.23) and (1.24) can be extended to the spaces
of the probability measures $\xi$ and $\eta$, respectively. That is,

(1.25)    $\delta(\xi_1, \xi_2) = \sup_{\eta} |K(\xi_1, \eta) - K(\xi_2, \eta)|$

and

(1.26)    $\delta(\eta_1, \eta_2) = \sup_{\xi} |K(\xi, \eta_1) - K(\xi, \eta_2)|.$

We shall say that a probability measure $\xi$ is discrete if there exists a denumerable
subset $\alpha$ of $A$ such that $\xi(\alpha) = 1$. Similarly, a probability measure $\eta$ will
be said to be discrete if $\eta(\beta) = 1$ for some denumerable subset $\beta$ of $B$. We shall
now prove the following theorem.

Theorem 1.5. If the choice of player 1 is restricted to elements of a class $C$ of
probability measures $\xi$ in which the class of all discrete probability measures $\xi$ is
dense, then a necessary and sufficient condition for the game to be strictly determined
is that there exists a sequence $\{a_i\}$ of elements of $A$ such that

(1.27)    $\lim_{i \to \infty} \inf_{\eta} \sup_{\xi_{\alpha_i}} K(\xi_{\alpha_i}, \eta) = \inf_{\eta} \sup_{\xi} K(\xi, \eta)$

where

$\alpha_i = \{a_1, a_2, \cdots, a_i\}.$

Proof: Since the class of all discrete probability measures $\xi$ lies dense in the
class $C$, there exists a sequence $\alpha = \{a_i\}$ ($i = 1, 2, \cdots$, ad inf.)
such that

(1.28)    $\sup_{\xi_\alpha} \inf_{\eta} K(\xi_\alpha, \eta) = \sup_{\xi} \inf_{\eta} K(\xi, \eta).$

Since $\alpha_i = \{a_1, \cdots, a_i\}$ is finite, we have

(1.29)    $\inf_{\eta} \sup_{\xi_{\alpha_i}} K(\xi_{\alpha_i}, \eta) = \sup_{\xi_{\alpha_i}} \inf_{\eta} K(\xi_{\alpha_i}, \eta).$

It then follows from Lemma 1.1 that

(1.30)    $\lim_{i \to \infty} \inf_{\eta} \sup_{\xi_{\alpha_i}} K(\xi_{\alpha_i}, \eta) = \sup_{\xi_\alpha} \inf_{\eta} K(\xi_\alpha, \eta) = \sup_{\xi} \inf_{\eta} K(\xi, \eta).$

Clearly, (1.30) and strict determinateness of the game imply (1.27). On the
other hand, any $\alpha = \{a_i\}$ that satisfies (1.27) will satisfy also (1.28) and (1.30).
But (1.27) and (1.30) imply that the game is strictly determined. Thus,
Theorem 1.5 is proved.

Theorem 1.6. If the choice of player 2 is restricted to elements of a class $C$ of
probability measures $\eta$ in which the class of all discrete probability measures $\eta$ lies
dense, then a necessary and sufficient condition for the strict determinateness of the
game is that there exists a sequence $\beta = \{b_i\}$ of elements of $B$ such that

(1.31)    $\lim_{i \to \infty} \sup_{\xi} \inf_{\eta_{\beta_i}} K(\xi, \eta_{\beta_i}) = \sup_{\xi} \inf_{\eta} K(\xi, \eta)$

where

$\beta_i = \{b_1, \cdots, b_i\}.$

This theorem is obtained from Theorem 1.5 by interchanging the players
1 and 2.





2. Statistical decision functions: the case of discrete chance variables.

2.1. The problem of statistical decisions and its interpretation as a zero sum two
person game. In some previous publications (see, for example, [3]) the author
has formulated the problem of statistical decisions as follows: Let $X = \{X^i\}$
($i = 1, 2, \cdots$, ad inf.) be an infinite sequence of chance variables. Any particular
observation $x$ on $X$ is given by a sequence $x = \{x^i\}$ of real values where $x^i$
denotes the observed value of $X^i$. Suppose that the probability distribution
$F(x)$ of $X$ is not known. It is, however, known that $F$ is an element of a given
class $\Omega$ of distribution functions. There is, furthermore, a space $D$ given whose
elements $d$ represent the possible decisions that can be made in the problem
under consideration. Usually each element $d$ of $D$ will be associated with a
certain subset $\omega$ of $\Omega$ and making the decision $d$ can be interpreted as accepting
the hypothesis that the true distribution is included in the subset $\omega$. The
fundamental problem in statistics is to give a rule for making a decision, that is, a
rule for selecting a particular element $d$ of $D$ on the basis of the observed sample
point $x$. In other words, the problem is to construct a function $d(x)$, called
decision function, which associates with each sample point $x$ an element $d(x)$
of $D$ so that the decision $d(x)$ is made when the sample point $x$ is observed.

This formulation of the problem includes the sequential as well as the classical
non-sequential case. For any sample point $x$, let $n(x)$ be the number of components
of $x$ that must be known to be able to determine the value of $d(x)$. In
other words, $n(x)$ is the smallest positive integer such that $d(y) = d(x)$ for any $y$
whose first $n$ coordinates are equal to the first $n$ coordinates of $x$. If no finite
$n$ exists with the above property, we put $n = \infty$. Clearly, $n(x)$ is the number
of observations needed to reach a decision. To put in evidence the dependence
of $n(x)$ on the decision rule used, we shall occasionally write $n(x; \mathfrak{D})$ instead of
$n(x)$ where $\mathfrak{D}$ denotes the decision function $d(x)$ used. If $n(x)$ is constant over
the whole sample space, we have the classical case, that is the case where a
decision is to be made on the basis of a predetermined number of observations.
If $n(x)$ is not constant over the sample space, we have the sequential case. A
basic question in statistics is this: What decision function should be chosen by
the statistician in any given problem? To set up principles for a proper choice of
a decision function, it is necessary to express in some way the degree of importance
of the various wrong decisions that can be made in the problem under
consideration. This may be expressed by a non-negative function $W(F, d)$,
called weight function, which is defined for all elements $F$ of $\Omega$ and all elements
$d$ of $D$. For any pair $(F, d)$, the value $W(F, d)$ expresses the loss caused by
making the decision $d$ when $F$ is the true distribution of $X$. For any positive
integer $n$, let $c(n)$ denote the cost of making $n$ observations. If the decision
function $\mathfrak{D} = d(x)$ is used the expected loss plus the expected cost of
experimentation is given by

(2.1)    $r[F, \mathfrak{D}] = \int_M W[F, d(x)]\, dF(x) + \int_M c(n(x))\, dF(x)$





where $M$ denotes the sample space, i.e. the totality of all sample points $x$. We
shall use the symbol $\mathfrak{D}$ for $d(x)$ when we want to indicate that we mean the whole
decision function and not merely a value of $d(x)$ corresponding to some $x$.

The above expression (2.1) is called the risk. Thus, the risk is a real valued
non-negative function of two variables $F$ and $\mathfrak{D}$ where $F$ may be any element
of $\Omega$ and $\mathfrak{D}$ any decision rule that may be adopted by the statistician.
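The risk (2.1) can be computed by direct enumeration in a small discrete example. The sketch below rests on assumptions that are not in the paper: the chance variables are independent and take the values 1 and 2, with $p$ the probability of the value 2; the decision rule is of the classical type with $n(x) = n$ for every $x$, accepting the hypothesis $p = p_0$ when the number of 2's does not exceed a threshold; the weight function takes the values 0 and 1; and the cost is proportional to the number of observations. The function and parameter names are illustrative.

```python
from itertools import product


def risk(p, n, threshold, p0, cost_per_observation):
    """Expected 0-1 loss plus cost for a fixed-sample-size rule, as in (2.1).

    p         : probability that a single observation equals 2 (indexes F)
    n         : predetermined number of observations, so n(x) = n for all x
    threshold : accept H when the number of 2's is at most this value
    p0        : value of p specified by the hypothesis H
    """
    hypothesis_true = (p == p0)
    expected_loss = 0.0
    for x in product((1, 2), repeat=n):          # all sample points of length n
        k = sum(1 for value in x if value == 2)  # number of 2's observed
        probability = p ** k * (1 - p) ** (n - k)
        accept = (k <= threshold)
        loss = 0 if accept == hypothesis_true else 1
        expected_loss += loss * probability
    expected_cost = cost_per_observation * n     # c(n(x)) is constant here
    return expected_loss + expected_cost


# Risk when H is true (p = 0.2) and when it is false (p = 0.7).
print(risk(0.2, 3, 1, 0.2, 0.01))
print(risk(0.7, 3, 1, 0.2, 0.01))
```

For a sequential rule $n(x)$ would vary with $x$, and the cost term would have to be averaged over the sample space as well.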

Of course, the statistician would like to make the risk $r$ as small as possible.
The difficulty he faces in this connection is that $r$ depends on two arguments $F$
and $\mathfrak{D}$, and he can merely choose $\mathfrak{D}$ but not $F$. The true distribution $F$ is chosen,
we may say, by Nature and Nature's choice is usually entirely unknown to the
statistician. Thus, the situation that arises here is very similar to that of a
zero sum two person game. As a matter of fact, the statistical problem may be
interpreted as a zero sum two person game by setting up the following
correspondence.

Two Person Game                          Statistical Decision Problem

Player 1                                 Nature
Player 2                                 Statistician
Pure strategy $a$ of player 1            Choice of true distribution $F$ by Nature
Pure strategy $b$ of player 2            Choice of decision rule $\mathfrak{D} = d(x)$
Space $A$                                Space $\Omega$
Space $B$                                Space $Q$ of decision rules $\mathfrak{D}$ that can be
                                         used by the statistician
Outcome $K(a, b)$                        Risk $r(F, \mathfrak{D})$
Mixed strategy $\xi$ of player 1         Probability measure $\xi$ defined over an additive
                                         class of subsets of $\Omega$ (a priori probability
                                         distribution in the space $\Omega$)
Mixed strategy $\eta$ of player 2        Probability measure $\eta$ defined over an additive
                                         class of subsets of the space $Q$. We shall refer
                                         to $\eta$ as randomized decision function.
Outcome $K(\xi, \eta)$ when mixed        Risk $r(\xi, \eta) = \int_Q \int_\Omega r(F, \mathfrak{D})\, d\xi\, d\eta$
strategies are used

2.2. Formulation of some conditions concerning the spaces $\Omega$, $D$, the weight
function $W(F, d)$ and the cost function of experimentation. A general theory of
statistical decision functions was developed in [3] assuming the fulfillment of seven
conditions listed on pp. 297-8.* The conditions listed there are unnecessarily
restrictive and we shall replace them here by a considerably weaker set of
conditions.

In this chapter we shall restrict ourselves to the study of the case where each
of the chance variables $X^1, X^2, \cdots$, ad inf. is discrete. We shall say that a chance
variable is discrete if it can take only countably many different values. Let
$a_{i1}, a_{i2}, \cdots$, ad inf. denote the possible values of the chance variable $X^i$. Since
it is immaterial how the values $a_{ij}$ are labeled, there is no loss of generality in
putting $a_{ij} = j$ ($j = 1, 2, 3, \cdots$, ad inf.). Thus, we formulate the following
condition.

* In [3] only the continuous case is treated (existence of a density function is assumed),
but all the results obtained there can be extended without difficulty to the discrete case.

Condition 2.1. The chance variable $X^i$ ($i = 1, 2, \cdots$, ad inf.) can take only
positive integral values.

As in [3], also here we postulate the boundedness of the weight function, i.e., 
we formulate the following condition. 

Condition 2.2. The weight function $W(F, d)$ is a bounded function of $F$ and $d$.

To formulate condition 2.3, we shall introduce some definitions. Let $\omega$ be a
given subset of $\Omega$. The distance between two elements $d_1$ and $d_2$ of $D$ relative to
$\omega$ is defined by

(2.2)    $\delta(d_1, d_2; \omega) = \sup_{F \in \omega} |W(F, d_1) - W(F, d_2)|.$

We shall refer to $\delta(d_1, d_2; \Omega)$ as the absolute distance, or more briefly, the
distance between $d_1$ and $d_2$. We shall say that a subset $D^*$ of $D$ is compact
(conditionally compact) relative to $\omega$, if it is compact (conditionally compact) in
the sense of the metric $\delta(d_1, d_2; \omega)$. If $D^*$ is compact relative to $\Omega$, we shall
say briefly that $D^*$ is compact.

An element $d$ of $D$ is said to be uniformly better than the element $d'$ of $D$
relative to a subset $\omega$ of $\Omega$ if

$W(F, d) \leq W(F, d')$ for all $F$ in $\omega$

and if

$W(F, d) < W(F, d')$ for at least one $F$ in $\omega$.

A subset $D^*$ of $D$ is said to be complete relative to a subset $\omega$ of $\Omega$ if for any $d$
outside $D^*$ there exists an element $d^*$ in $D^*$ such that $d^*$ is uniformly better than
$d$ relative to $\omega$.
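When $\omega$ and $D$ are finite, the two definitions above can be checked mechanically. The sketch below assumes the weight function is stored as a nested dictionary `W[F][d]`; the function names, the labels `F1`, `F2`, `d1`, `d2`, `d3` and the numerical weights are all hypothetical.

```python
def uniformly_better(W, d, d_prime, omega):
    """True if d is uniformly better than d_prime relative to omega."""
    never_worse = all(W[F][d] <= W[F][d_prime] for F in omega)
    somewhere_better = any(W[F][d] < W[F][d_prime] for F in omega)
    return never_worse and somewhere_better


def is_complete(W, D_star, D, omega):
    """True if every d outside D_star is beaten uniformly by some element of D_star."""
    return all(
        any(uniformly_better(W, d_star, d, omega) for d_star in D_star)
        for d in D
        if d not in D_star
    )


# Hypothetical weights: d3 is never better than d1 and is worse under F1.
W = {
    "F1": {"d1": 0, "d2": 1, "d3": 1},
    "F2": {"d1": 1, "d2": 0, "d3": 1},
}
D = ["d1", "d2", "d3"]
omega = ["F1", "F2"]

print(uniformly_better(W, "d1", "d3", omega))
print(is_complete(W, ["d1", "d2"], D, omega))
print(is_complete(W, ["d1"], D, omega))
```

In this example neither of `d1` and `d2` is uniformly better than the other, so a complete subset must contain both.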

Condition 2.3. For any positive integer $i$ and for any positive $\epsilon$ there exists a
subset $D_{i,\epsilon}$ of $D$ which is compact relative to $\Omega$ and complete relative to $\omega_{i,\epsilon}$, where
$\omega_{i,\epsilon}$ is the class of all elements $F$ of $\Omega$ for which prob$\{X^1 \leq i\} \geq \epsilon$.

If $D$ is compact, then it is compact with respect to any subset $\omega$ of $\Omega$ and
Condition 2.3 is fulfilled. For any finite space $D$, Condition 2.3 is obviously
fulfilled. Thus, Condition 2.3 is fulfilled, for example, for any problem of testing
a statistical hypothesis $H$, since in that case the space $D$ contains only two
elements $d_1$ and $d_2$ where $d_1$ denotes the decision to reject $H$ and $d_2$ the decision to
accept $H$.

In [3] it was assumed that the cost of experimentation depends only on the
number of observations made. This assumption is unnecessarily restrictive.
The cost may depend also on the decision rule $\mathfrak{D}$ used. For example, let $\mathfrak{D}_1$
and $\mathfrak{D}_2$ be two decision rules such that $n(x; \mathfrak{D}_1)$ is equal to a constant $n_0$, while
$\mathfrak{D}_2$ is such that at any stage of the experimentation where $\mathfrak{D}_2$ requires taking at
least one additional observation the probability is positive that experimentation
will be terminated by taking only one more observation. Let $x^0$ be a particular
sample point for which $n(x^0; \mathfrak{D}_2) = n(x^0; \mathfrak{D}_1) = n_0$. There are undoubtedly
cases where the cost of experimentation is appreciably increased by the necessity
of having to look at the observations at each stage of the experiment before we
can decide whether or not to continue taking additional observations. Thus
in many cases the cost of experimentation when $x^0$ is observed may be greater
for $\mathfrak{D}_2$ than for $\mathfrak{D}_1$. The cost may also depend on the actual values of the
observations made. Thus, we shall assume that the cost $c$ is a single valued function
of the observations $x^1, \cdots, x^m$ and the decision rule $\mathfrak{D}$ used, i.e.,
$c = c(x^1, \cdots, x^m, \mathfrak{D})$.

Condition 2.4. The cost $c(x^1, \cdots, x^m, \mathfrak{D})$ is non-negative and
$\lim c(x^1, \cdots, x^m, \mathfrak{D}) = \infty$ uniformly in $x^1, \cdots, x^m, \mathfrak{D}$ as $m \to \infty$. For each
positive integral value $m$, there exists a finite value $c_m$, depending only on $m$, such
that $c(x^1, \cdots, x^m, \mathfrak{D}) \leq c_m$ identically in $x^1, \cdots, x^m, \mathfrak{D}$. Furthermore,
$c(x^1, \cdots, x^m, \mathfrak{D}_1) = c(x^1, \cdots, x^m, \mathfrak{D}_2)$ if $n(x; \mathfrak{D}_1) = n(x; \mathfrak{D}_2)$ for all $x$. Finally,
for any sample point $x$ we have
$c(x^1, \cdots, x^{n(x; \mathfrak{D}_1)}, \mathfrak{D}_1) \leq c(x^1, \cdots, x^{n(x; \mathfrak{D}_2)}, \mathfrak{D}_2)$
if there exists a positive integer $m$ such that $n(x; \mathfrak{D}_1) = n(x; \mathfrak{D}_2)$ when $n(x; \mathfrak{D}_2) < m$
and $n(x; \mathfrak{D}_1) = m$ when $n(x; \mathfrak{D}_2) \geq m$.

2.3. Alternative definition of a randomized decision function, and a further condition on the cost function. In Section 2.1 we defined a randomized decision function as a probability measure $\eta$ defined over some additive class of subsets of the space $Q$ of all decision functions $d(x)$. Before formulating an alternative definition of a randomized decision function, we have to make precise the meaning of $\eta$ by stating the additive class $C_Q$ of subsets of $Q$ over which $\eta$ is defined. Let $C_D$ be the smallest additive class of subsets of $D$ which contains all subsets of $D$ which are open in the sense of the metric $\delta(d_1, d_2; \Omega)$. For any finite set of positive integers $a_1, \dots, a_k$ and for any element $D^*$ of $C_D$, let $Q(a_1, \dots, a_k, D^*)$ be the set of all decision functions $d(x)$ which satisfy the following two conditions: (1) If $x^1 = a_1, x^2 = a_2, \dots, x^k = a_k$, then $n(x) = k$; (2) If $x^1 = a_1, \dots, x^k = a_k$, then $d(x)$ is an element of $D^*$. Let $\bar{C}_Q$ be the class of all sets $Q(a_1, \dots, a_k, D^*)$ corresponding to all possible values of $k, a_1, \dots, a_k$ and all possible elements $D^*$ of $C_D$. The additive class $C_Q$ is defined as the smallest additive class containing $\bar{C}_Q$ as a subclass. Then with any $\eta$ we can associate two sequences of functions

$\{z_m(x^1, \dots, x^m \mid \eta)\}$

and

$\{\delta_{x^1 \cdots x^m}(D^* \mid \eta)\}$ $(m = 1, 2, \dots,$ ad inf.$)$

where $0 \le z_m(x^1, \dots, x^m \mid \eta) \le 1$ and for any $x^1, \dots, x^m$, $\delta_{x^1 \cdots x^m}$ is a probability measure in $D$ defined over the additive class $C_D$. Here

$z_m(x^1, \dots, x^m \mid \eta)$





denotes the conditional probability that $n(x) > m$ under the condition that the first $m$ observations are equal to $x^1, \dots, x^m$ and experimentation has not been terminated for $(x^1, \dots, x^k)$ $(k = 1, 2, \dots, m - 1)$, while

$\delta_{x^1 \cdots x^m}(D^* \mid \eta)$

is the conditional probability that the final decision $d$ will be an element of $D^*$ under the condition that the sample $(x^1, \dots, x^m)$ is observed and $n(x) = m$. Thus

(2.3)    $z_1(x^1 \mid \eta)\, z_2(x^1, x^2 \mid \eta) \cdots z_{m-1}(x^1, \dots, x^{m-1} \mid \eta)\, [1 - z_m(x^1, \dots, x^m \mid \eta)] = \eta[Q(x^1, \dots, x^m, D)]$

and

(2.4)    $\delta_{x^1 \cdots x^m}(D^* \mid \eta) = \dfrac{\eta[Q(x^1, \dots, x^m, D^*)]}{\eta[Q(x^1, \dots, x^m, D)]}.$
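Relations (2.3) and (2.4) can be checked numerically when the measure is concentrated on finitely many decision functions. The following is only a sketch under that assumption: each decision function is represented (hypothetically) as a dictionary mapping the samples at which it stops to the decision taken there, and the names `eta_Q`, `z` and `delta` are introduced here for illustration.

```python
from fractions import Fraction

def eta_Q(eta, xs, D_star=None):
    """eta[Q(x^1..x^m, D*)]: mass of the rules that stop exactly at xs with a
    decision in D_star (D_star=None stands for the whole decision space D)."""
    total = Fraction(0)
    for prob, rule in eta:
        if xs in rule and (D_star is None or rule[xs] in D_star):
            total += prob
    return total

def z(eta, xs):
    """Conditional probability of continuing past the sample xs, solved from (2.3)."""
    # Mass of the rules that have not stopped at any proper initial segment of xs.
    reached = Fraction(1) - sum(
        (eta_Q(eta, xs[:k]) for k in range(1, len(xs))), Fraction(0)
    )
    if reached == 0:
        raise ValueError("experimentation never reaches this sample")
    return 1 - eta_Q(eta, xs) / reached

def delta(eta, xs, D_star):
    """Conditional probability (2.4) that the final decision lies in D_star."""
    stop = eta_Q(eta, xs)
    if stop == 0:
        raise ValueError("experimentation never stops at this sample")
    return eta_Q(eta, xs, D_star) / stop

# Illustrative data: observations take the values 1 or 2.
rule_a = {(1,): "d1", (2,): "d2"}                  # always stops after one observation
rule_b = {(2,): "d1", (1, 1): "d1", (1, 2): "d2"}  # continues only when x^1 = 1
eta = [(Fraction(1, 4), rule_a), (Fraction(3, 4), rule_b)]
```

With these data `z(eta, (1,))` is 3/4, and the product `z(eta, (1,)) * (1 - z(eta, (1, 1)))` agrees with `eta_Q(eta, (1, 1))`, as (2.3) requires for m = 2.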


We shall now consider two sequences of functions $\{z_m(x^1, \dots, x^m)\}$ and $\{\delta_{x^1 \cdots x^m}(D^*)\}$, not necessarily generated by a given $\eta$. An alternative definition of a randomized decision function can be given in terms of these two sequences as follows: After the first observation $x^1$ has been drawn, the statistician determines whether or not experimentation be continued by a chance mechanism constructed so that the probability of continuing experimentation is equal to $z_1(x^1)$. If it is decided to terminate experimentation, the statistician uses a chance mechanism to select the final decision $d$ constructed so that the probability distribution of the selected $d$ is equal to $\delta_{x^1}(D^*)$. If it is decided to take a second observation and the value $x^2$ is obtained, again a chance mechanism is used to determine whether or not to stop experimentation such that the probability of taking a third observation is equal to $z_2(x^1, x^2)$. If it is decided to stop experimentation, a chance mechanism is used to select the final $d$ so that the probability distribution of the selected $d$ is equal to $\delta_{x^1 x^2}(D^*)$, and so on.
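The chance mechanism just described can be sketched as a short simulation. This is an illustration only, assuming a finite set of possible decisions; the function name `run_procedure` and its arguments (`observe`, `z`, `delta`, `max_steps`) are hypothetical and not part of the text.

```python
import random

def run_procedure(observe, z, delta, rng, max_steps=1000):
    """Carry out one run of the randomized sequential procedure.

    observe(m) -> the m-th observation x^m (m = 1, 2, ...)
    z(xs)      -> probability of taking a further observation after the sample xs
    delta(xs)  -> dict {decision: probability} used when stopping at xs
    """
    xs = []
    for m in range(1, max_steps + 1):
        xs.append(observe(m))
        sample = tuple(xs)
        # Continue with probability z_m(x^1, ..., x^m); otherwise stop.
        if rng.random() >= z(sample):
            dist = delta(sample)
            decisions = list(dist)
            weights = [dist[d] for d in decisions]
            # Select the final decision according to delta_{x^1...x^m}.
            d = rng.choices(decisions, weights=weights, k=1)[0]
            return sample, d
    raise RuntimeError("no termination within max_steps observations")
```

For instance, with `z` equal to 1 for samples of fewer than three observations and 0 afterwards, every run stops after exactly three observations.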

We shall denote by $\zeta$ a randomized decision function defined in terms of two sequences $\{z_m(x^1, \dots, x^m)\}$ and $\{\delta_{x^1 \cdots x^m}(D^*)\}$, as described above. Clearly, any given $\eta$ generates a particular $\zeta$. Let $\zeta(\eta)$ denote the $\zeta$ generated by $\eta$. One can easily verify that two different $\eta$'s may generate the same $\zeta$, i.e., there exist two different $\eta$'s, say $\eta_1$ and $\eta_2$, such that $\zeta(\eta_1) = \zeta(\eta_2)$.

We shall now show that for any $\zeta$ there exists an $\eta$ such that $\zeta(\eta) = \zeta$. Let $\zeta$ be given by the two sequences $\{z_m(x^1, \dots, x^m)\}$ and $\{\delta_{x^1 \cdots x^m}(D^*)\}$. Let $b_j$ denote a sequence of $r_j$ positive integers, i.e., $b_j = (b_{j1}, \dots, b_{j r_j})$ $(j = 1, 2, \dots, k)$ subject to the restriction that no $b_j$ is equal to an initial segment of $b_l$ $(j \ne l)$. Let, furthermore, $D_1^*, \dots, D_k^*$ be $k$ elements of $C_D$. Finally, let $Q(b_1, \dots, b_k, D_1^*, \dots, D_k^*)$ denote the class of all decision functions $d(x)$ which satisfy the following condition: If $(x^1, \dots, x^{r_j}) = b_j$ then $n(x) = r_j$ and $d(x)$ is an element of $D_j^*$ $(j = 1, \dots, k)$. Let $\eta$ be a probability measure such that

(2.5)    $\eta[Q(b_1, \dots, b_k, D_1^*, \dots, D_k^*)] = \prod_{m=1}^{\infty} \prod_{x^m=1}^{\infty} \cdots \prod_{x^1=1}^{\infty} \left\{ z_m(x^1, \dots, x^m)^{g_m(x^1, \dots, x^m)} \, [1 - z_m(x^1, \dots, x^m)]^{g_m^*(x^1, \dots, x^m)} \right\} \times \prod_{j=1}^{k} \delta_{b_j}(D_j^*)$

holds for all values of $k, b_1, \dots, b_k, D_1^*, \dots, D_k^*$. Here $g_m(x^1, \dots, x^m) = 1$ if $(x^1, \dots, x^m)$ is equal to an initial segment of at least one of the samples $b_1, \dots, b_k$, but is not equal to any of the samples $b_1, \dots, b_k$. In all other cases $g_m(x^1, \dots, x^m) = 0$. The function $g_m^*(x^1, \dots, x^m)$ is equal to 1 if $(x^1, \dots, x^m)$ is equal to one of the samples $b_1, \dots, b_k$, and zero otherwise.
Clearly, for any $\eta$ which satisfies (2.5) we have $\zeta(\eta) = \zeta$. The existence of such an $\eta$ can be shown as follows. With any finite set of positive integers $i_1, \dots, i_r$ we associate an elementary event, say $A_r(i_1, \dots, i_r)$. Let $\bar{A}_r(i_1, \dots, i_r)$ denote the negation of the event $A_r(i_1, \dots, i_r)$. Thus, we have a denumerable system of elementary events by letting $r, i_1, \dots, i_r$ take any positive integral values. We shall assume that the events $A_1(1), A_1(2), \dots,$ ad inf. are independent and the probability that $A_1(i)$ happens is equal to $z_1(i)$. We shall now define the conditional probability of $A_2(i, j)$ knowing for any $k$ whether $A_1(k)$ or $\bar{A}_1(k)$ happened. If $A_1(i)$ happened, the conditional probability of $A_2(i, j)$ is equal to $z_2(i, j)$, and 0 otherwise. The conditional probability of the joint event that $A_2(i_1, j_1), A_2(i_2, j_2), \dots, A_2(i_r, j_r), \bar{A}_2(i_{r+1}, j_{r+1}), \dots,$ and $\bar{A}_2(i_{r+s}, j_{r+s})$ will happen is the product of the conditional probabilities of each of these events (knowing for each $i$ whether $A_1(i)$ or $\bar{A}_1(i)$ happened). Similarly, the conditional probability (knowing for any $i$ and for any $(i, j)$ whether the corresponding event $A_2(i, j)$ happened or not) that $A_3(i_1, j_1, k_1)$ and $A_3(i_2, j_2, k_2)$ and $\cdots$ $A_3(i_r, j_r, k_r)$ and $\bar{A}_3(i_{r+1}, j_{r+1}, k_{r+1})$ and $\cdots$ and $\bar{A}_3(i_{r+s}, j_{r+s}, k_{r+s})$ will simultaneously happen is equal to the product of the conditional probabilities of each of them. The conditional probability of $A_3(i, j, k)$ is equal to $z_3(i, j, k)$ if $A_1(i)$ and $A_2(i, j)$ happened, and zero otherwise; and so on. Clearly, this system of probabilities is consistent.

If we interpret $A_r(i_1, \dots, i_r)$ as the event that the decision function $\mathfrak{D} = d(x)$ selected by the statistician has the property that $n(x; \mathfrak{D}) > r$ when $x^1 = i_1, \dots, x^r = i_r$, the above defined system of probabilities for the denumerable sequence $\{A_r(i_1, \dots, i_r)\}$ of events implies the validity of (2.5) for $D_j^* = D$ $(j = 1, \dots, k)$. The consistency of the formula (2.5) for $D_j^* = D$ implies, as can easily be verified, the consistency of (2.5) also in the general case when $D_j^* \ne D$.

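Let $\zeta_i$ be given by the sequences $\{z_{m,i}(x^1, \dots, x^m)\}$ and $\{\delta_{x^1 \cdots x^m, i}\}$ $(m = 1, 2, \dots,$ ad inf.$)$. Let, furthermore, $\zeta$ be given by $\{z_m(x^1, \dots, x^m)\}$ and $\{\delta_{x^1 \cdots x^m}\}$. We shall say that

(2.6)    $\lim_{i \to \infty} \zeta_i = \zeta$

if for any $m, x^1, \dots, x^m$ we have

(2.7)    $\lim_{i \to \infty} z_{m,i}(x^1, \dots, x^m) = z_m(x^1, \dots, x^m)$

and

(2.8)    $\lim_{i \to \infty} \delta_{x^1 \cdots x^m, i}(D^*) = \delta_{x^1 \cdots x^m}(D^*)$

for any open subset $D^*$ of $D$ whose boundary has probability measure zero according to the limit probability measure $\delta_{x^1 \cdots x^m}$.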

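In addition to Condition 2.4, we shall impose the following continuity condition on the cost function.

Condition 2.5. If

$\lim_{i \to \infty} \zeta(\eta_i) = \zeta(\eta_0),$

then

$\lim_{i \to \infty} \int_{Q(x^1, \dots, x^m)} c(x^1, \dots, x^m, \mathfrak{D}) \, d\eta_i = \int_{Q(x^1, \dots, x^m)} c(x^1, \dots, x^m, \mathfrak{D}) \, d\eta_0,$

where $Q(x^1, \dots, x^m)$ is the class of all decision functions $\mathfrak{D}$ for which $n(y, \mathfrak{D}) = m$ if $y^1 = x^1, \dots, y^m = x^m$.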

2.4. The main theorem. In this section we shall show that the statistical decision problem, viewed as a zero sum two person game, is strictly determined. It will be shown in subsequent sections that this basic theorem has many important consequences for the theory of statistical decision functions. A precise formulation of the theorem is as follows:

Theorem 2.1. If Conditions 2.1-2.5 are fulfilled, the decision problem, viewed as a zero sum two person game, is strictly determined, i.e.,

(2.9)    $\sup_{\xi} \inf_{\eta} r(\xi, \eta) = \inf_{\eta} \sup_{\xi} r(\xi, \eta).$

To prove the above theorem, we shall first derive several lemmas.
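What strict determinateness means can be seen in the simplest finite case. The sketch below is an illustration only and not part of the proof: it assumes a hypothetical 2 x 2 risk table with integer entries, where row i is a state of Nature and column j a non-randomized decision rule, and computes the two sides of (2.9) exactly. The function names are introduced here.

```python
from fractions import Fraction
from itertools import combinations

def lower_value(r):
    """Sup over xi of Inf over eta for a 2 x n table r[i][j] with integer entries.

    Nature mixes the two rows with weights (p, 1 - p). For fixed p the infimum
    over randomized eta is attained at a single column, so the inner function
    is the lower envelope of n straight lines in p.
    """
    n = len(r[0])

    def inner(p):
        return min(p * r[0][j] + (1 - p) * r[1][j] for j in range(n))

    # The maximum of the envelope lies at an endpoint or where two lines cross.
    candidates = {Fraction(0), Fraction(1)}
    for j, k in combinations(range(n), 2):
        denom = (r[0][j] - r[1][j]) - (r[0][k] - r[1][k])
        if denom != 0:
            p = Fraction(r[1][k] - r[1][j], denom)
            if 0 <= p <= 1:
                candidates.add(p)
    return max(inner(p) for p in candidates)

def upper_value(r):
    """Inf over eta of Sup over xi for a 2 x 2 table, by exchanging the players."""
    t = [[-r[i][j] for i in range(2)] for j in range(2)]
    return -lower_value(t)

r = [[0, 2], [3, 1]]
pure_lower = max(min(row) for row in r)                      # without randomization
pure_upper = min(max(r[0][j], r[1][j]) for j in range(2))    # without randomization
```

For this table the values without randomization differ (1 against 2), whereas `lower_value(r)` and `upper_value(r)` both equal 3/2.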

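Lemma 2.1. For any $\varepsilon > 0$, there exists a positive integer $m_\varepsilon$, depending only on $\varepsilon$, such that the value of $\sup_{\xi} \inf_{\eta} r(\xi, \eta)$ is not changed by more than $\varepsilon$ if we restrict the choice of the statistician to decision functions $d(x)$ for which $n(x) \le m_\varepsilon$ for all $x$.

Proof: Put $W_0 = \sup_{F, d} W(F, d)$ and choose $m_\varepsilon$ so that

(2.10)    $c(x^1, \dots, x^m, \mathfrak{D}) > \dfrac{W_0^2}{\varepsilon}$

identically in $x^1, \dots, x^m$ and $\mathfrak{D}$ for all $m \ge m_\varepsilon$. The existence of such a value $m_\varepsilon$ follows from Condition 2.4. Consider the function $\inf_{\mathfrak{D}} r(\xi, \mathfrak{D})$. Our lemma is proved, if we can show that for any $\xi$, the value of $\inf_{\mathfrak{D}} r(\xi, \mathfrak{D})$ is not increased by more than $\varepsilon$ if we restrict $\mathfrak{D}$ to be such that $n(x, \mathfrak{D}) \le m_\varepsilon$ for all $x$. The latter statement is proved, if we can show that for any decision function $\mathfrak{D}_1 = d_1(x)$ we can find another decision function $\mathfrak{D}_2 = d_2(x)$ such that $n(x, \mathfrak{D}_2) \le m_\varepsilon$ for all $x$ and $r(\xi, \mathfrak{D}_2) \le r(\xi, \mathfrak{D}_1) + \varepsilon$. There are two cases to be considered: (a) $\operatorname{prob}\{n(X, \mathfrak{D}_1) > m_\varepsilon \mid \xi\} \ge \varepsilon / W_0$ and (b) $\operatorname{prob}\{n(X, \mathfrak{D}_1) > m_\varepsilon \mid \xi\} < \varepsilon / W_0$. In case (a) we have $r(\xi, \mathfrak{D}_1) \ge W_0$, since by (2.10) the expected cost alone is at least $(W_0^2 / \varepsilon)(\varepsilon / W_0)$. In this case we can choose $\mathfrak{D}_2$ to be the rule that we decide for some element $d_0$ of $D$ without taking any observations. Clearly, for this choice of $\mathfrak{D}_2$ we shall have $r(\xi, \mathfrak{D}_2) \le r(\xi, \mathfrak{D}_1)$. In case (b) we choose $\mathfrak{D}_2$ as follows:

$d_2(x) = d_1(x)$ whenever $n(x, \mathfrak{D}_1) \le m_\varepsilon$;

$d_2(x) = d_0$ whenever $n(x, \mathfrak{D}_1) > m_\varepsilon$,

where $d_0$ is an arbitrary element of $D$. Thus, $n(x, \mathfrak{D}_2) \le m_\varepsilon$ for all $x$. Since $\operatorname{prob}\{n(X, \mathfrak{D}_1) > m_\varepsilon \mid \xi\} < \varepsilon / W_0$, it is clear that $r(\xi, \mathfrak{D}_2) \le r(\xi, \mathfrak{D}_1) + \varepsilon$. Hence our lemma is proved.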
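The truncated rule used in case (b) of the proof can be written out as follows. This is a minimal sketch under an assumed representation: a decision rule is a function that returns a decision when it stops at a sample and `None` when it asks for another observation. The names `truncate` and `sample_size` are hypothetical.

```python
def truncate(rule, m_eps, d0):
    """Follow `rule` for at most m_eps observations, then stop with the decision d0.

    rule(xs) returns a decision if the rule stops after the sample xs,
    and None if it asks for a further observation.
    """
    def truncated(xs):
        decision = rule(xs)
        if decision is None and len(xs) >= m_eps:
            return d0  # forced stop, so that n(x) never exceeds m_eps
        return decision
    return truncated

def sample_size(rule, xs):
    """n(x): the number of observations `rule` uses along the sequence xs."""
    for m in range(1, len(xs) + 1):
        if rule(tuple(xs[:m])) is not None:
            return m
    raise ValueError("rule did not stop on the supplied observations")

# Illustrative rule: stop as soon as the observations sum to at least 10.
rule = lambda xs: "d1" if sum(xs) >= 10 else None
rule2 = truncate(rule, 3, "d0")
```

On samples where the original rule stops within three observations the two rules agree; elsewhere the truncated rule stops at the third observation with the fixed decision.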

Let $Q^m$ denote the class of decision functions $\mathfrak{D}$ for which $n(x; \mathfrak{D}) \le m$ for all $x$. For any positive $\varepsilon$, let $Q^{m,\varepsilon}$ denote the class of all decision functions which satisfy the following two conditions simultaneously: (1) $n(x, \mathfrak{D}) \le m$ for all $x$; (2) $d(x)$ is an element of $D^*_{x^1, \varepsilon}$, where $D^*_{x^1, \varepsilon}$ denotes the subset of $D$ having the properties stated in Condition 2.3. Clearly, $Q^{m,\varepsilon} \subset Q^m$. A probability measure $\eta$ will be denoted by $\eta^m$ if $\eta(Q^m) = 1$ and by $\eta^{m,\varepsilon}$ if $\eta(Q^{m,\varepsilon}) = 1$.

Lemma 2.2. The following inequality holds:

(2.11)    $\sup_{\xi} \inf_{\eta^m} r(\xi, \eta^m) \le \sup_{\xi} \inf_{\eta^{m,\varepsilon}} r(\xi, \eta^{m,\varepsilon}) \le \sup_{\xi} \inf_{\eta^m} r(\xi, \eta^m) + \varepsilon W_0,$

where $W_0$ is an upper bound of $W(F, d)$.

Proof: The first half of (2.11) is obvious. If we replace the subscript $x^1$ by the chance variable $X^1$, the set $\omega_{X^1, \varepsilon}$ defined in Condition 2.3 will be a random subset of $\Omega$. It follows easily from the definition of $\omega_{x^1, \varepsilon}$ that

(2.12)    $\operatorname{prob}\{F \in \omega_{X^1, \varepsilon} \mid F\} \ge 1 - \varepsilon.$

With any decision function $\mathfrak{D} = d(x)$ we shall associate another decision function $\mathfrak{D}^* = d^*(x)$ such that $n(x, \mathfrak{D}) = n(x, \mathfrak{D}^*)$; $d^*(x) = d(x)$ whenever $d(x) \in D^*_{x^1, \varepsilon}$; and $d^*(x)$ is an element of $D^*_{x^1, \varepsilon}$ that is uniformly better than $d(x)$ relative to $\omega_{x^1, \varepsilon}$, whenever $d(x) \notin D^*_{x^1, \varepsilon}$. It follows from (2.12) and the fact that $W_0$ is an upper bound of $W(F, d)$ that

(2.13)    $r(F, \mathfrak{D}^*) \le r(F, \mathfrak{D}) + \varepsilon W_0.$

The second half of (2.11) is an immediate consequence of (2.13) and our lemma is proved.

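Lemma 2.3. The equation

(2.14)    $\sup_{\xi} \inf_{\eta^{m,\varepsilon}} r(\xi, \eta^{m,\varepsilon}) = \inf_{\eta^{m,\varepsilon}} \sup_{\xi} r(\xi, \eta^{m,\varepsilon})$

holds for all $m$ and $\varepsilon$.

Proof: For any positive integral values $m$, $k$ and for any $\rho > 0$, let $\Omega^{m,k,\rho}$ be the class of all elements $F$ of $\Omega$ for which

$\operatorname{prob}\{X^1 \le k \text{ and } X^2 \le k \text{ and } \cdots \text{ and } X^m \le k \mid F\} \ge 1 - \rho.$

A probability measure $\xi$ for which $\xi(\Omega^{m,k,\rho}) = 1$ will be denoted by $\xi^{m,k,\rho}$. To prove (2.14), we shall first prove the inequality

(2.15)    $\left| \sup_{\xi^{m,k,\rho}} \inf_{\eta^{m,\varepsilon}} r(\xi^{m,k,\rho}, \eta^{m,\varepsilon}) - \inf_{\eta^{m,\varepsilon}} \sup_{\xi^{m,k,\rho}} r(\xi^{m,k,\rho}, \eta^{m,\varepsilon}) \right| \le \rho (W_0 + C_m)$

where $C_m$ is an upper bound of $c(x^1, \dots, x^r, \mathfrak{D})$ for all $r \le m$, $x^1, \dots, x^r$ and $\mathfrak{D}$.

Since for any $d(x)$ in $Q^{m,\varepsilon}$, $d(x)$ must be an element of $D^*_{x^1, \varepsilon}$ and since $D^*_{x^1, \varepsilon}$ is compact, it is sufficient to prove the validity of (2.14) in the case when $D^*_{x^1, \varepsilon}$ is a finite set. Thus, we shall assume in the remainder of the proof that $D^*_{x^1, \varepsilon}$ is finite.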

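Let $\delta$ be a given positive number and let $Q^{m,\varepsilon,\delta}$ be a finite subset of $Q^{m,\varepsilon}$ satisfying the following condition: for any element $\mathfrak{D} = d(x)$ in $Q^{m,\varepsilon}$ there exists an element $\mathfrak{D}^* = d^*(x)$ in $Q^{m,\varepsilon,\delta}$ such that

$d^*(x) = d(x)$ and $|c(x, \mathfrak{D}^*) - c(x, \mathfrak{D})| \le \delta$

for all $x$ for which $x^1 \le k, x^2 \le k, \dots,$ and $x^m \le k$. Clearly, for any choice of $\delta$ there exists a finite subset $Q^{m,\varepsilon,\delta}$ of $Q^{m,\varepsilon}$ with the desired property. For any $\mathfrak{D}$ in $Q^{m,\varepsilon}$, we can then find an element $\mathfrak{D}^*$ in $Q^{m,\varepsilon,\delta}$ such that

(2.16)    $r(F, \mathfrak{D}^*) \le r(F, \mathfrak{D}) + \rho (W_0 + C_m) + \delta$

for all $F$ in $\Omega^{m,k,\rho}$. From this it follows that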



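where $\eta^{m,\varepsilon,\delta}(Q^{m,\varepsilon,\delta}) = 1$. Since $Q^{m,\varepsilon,\delta}$ is finite, we have

(2.18)    $\sup_{\xi^{m,k,\rho}} \inf_{\eta^{m,\varepsilon,\delta}} r(\xi^{m,k,\rho}, \eta^{m,\varepsilon,\delta}) = \inf_{\eta^{m,\varepsilon,\delta}} \sup_{\xi^{m,k,\rho}} r(\xi^{m,k,\rho}, \eta^{m,\varepsilon,\delta}).$

Inequality (2.15) follows from (2.16), (2.17) and (2.18) and the fact that $\delta$ can be chosen arbitrarily small.

Lemmas 1.1, 1.3 and the inequality (2.15) imply that Lemma 2.3 must hold if

(2.19)    $\lim_{k \to \infty} \inf_{\eta^{m,\varepsilon}} \sup_{\xi^{m,k,\rho}} r(\xi^{m,k,\rho}, \eta^{m,\varepsilon}) = \inf_{\eta^{m,\varepsilon}} \sup_{\xi} r(\xi, \eta^{m,\varepsilon}).$

Thus the proof of Lemma 2.3 is completed if we can show the validity of (2.19).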

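Let $\{\eta_k^{m,\varepsilon}\}$ $(k = 1, 2, \dots,$ ad inf.$)$ be a sequence of randomized decision functions such that

(2.20)    $\lim_{k \to \infty} \left[ \sup_{\xi^{m,k,\rho}} r(\xi^{m,k,\rho}, \eta_k^{m,\varepsilon}) - \inf_{\eta^{m,\varepsilon}} \sup_{\xi^{m,k,\rho}} r(\xi^{m,k,\rho}, \eta^{m,\varepsilon}) \right] = 0.$

Let $\zeta_k = \zeta(\eta_k^{m,\varepsilon})$ (see definition in Section 2.3) and let $\zeta_k$ be given by the two sequences of functions $\{z_{r,k}(x^1, \dots, x^r)\}$ and $\{\delta_{x^1 \cdots x^r, k}\}$ $(r = 1, 2, \dots, m)$. Since there are only countably many samples $(x^1, \dots, x^r)$ $(r \le m)$, there exists a subsequence $\{k'\}$ of the sequence $\{k\}$ such that

(2.21)    $\lim_{k' \to \infty} z_{r,k'}(x^1, \dots, x^r) = z_r(x^1, \dots, x^r)$

and

(2.22)    $\lim_{k' \to \infty} \delta_{x^1 \cdots x^r, k'} = \delta_{x^1 \cdots x^r}$

for all $r$ and all samples $(x^1, \dots, x^r)$. Let $\eta_0^{m,\varepsilon}$ be a randomized decision function such that $\zeta(\eta_0^{m,\varepsilon})$ is equal to the $\zeta$ defined by $\{z_r(x^1, \dots, x^r)\}$ and $\{\delta_{x^1 \cdots x^r}\}$ $(r = 1, 2, \dots, m)$.

For any element $F$ of $\Omega$ and for any $\nu > 0$, there exists a finite subset $M_\nu$ of the $m$-dimensional sample space such that the probability (under $F$) that the sample $(x^1, \dots, x^m)$ will fall in $M_\nu$ is $\ge 1 - \nu$. From this and the continuity of the cost function (Condition 2.5) it follows that

(2.23)    $\lim_{k' \to \infty} r(F, \eta_{k'}^{m,\varepsilon}) = r(F, \eta_0^{m,\varepsilon})$ for all $F$.

Clearly,

(2.24)    $\sup_{\xi^{m,k,\rho}} r(\xi^{m,k,\rho}, \eta) = \sup_{F^{m,k,\rho}} r(F^{m,k,\rho}, \eta)$

where $F^{m,k,\rho}$ is an element of $\Omega^{m,k,\rho}$. Hence

(2.25)    $\inf_{\eta^{m,\varepsilon}} \sup_{\xi^{m,k,\rho}} r(\xi^{m,k,\rho}, \eta^{m,\varepsilon}) = \inf_{\eta^{m,\varepsilon}} \sup_{F^{m,k,\rho}} r(F^{m,k,\rho}, \eta^{m,\varepsilon}).$

Since any $F$ in $\Omega$ is contained in $\Omega^{m,k,\rho}$ for sufficiently large $k$, it follows from (2.20) and (2.25) that

(2.26)    $\lim_{k' \to \infty} r(F, \eta_{k'}^{m,\varepsilon}) \le \lim_{k \to \infty} \left\{ \inf_{\eta^{m,\varepsilon}} \sup_{F^{m,k,\rho}} r(F^{m,k,\rho}, \eta^{m,\varepsilon}) \right\}.$

Hence, because of (2.23),

(2.27)    $r(F, \eta_0^{m,\varepsilon}) \le \lim_{k \to \infty} \left\{ \inf_{\eta^{m,\varepsilon}} \sup_{F^{m,k,\rho}} r(F^{m,k,\rho}, \eta^{m,\varepsilon}) \right\}.$

Thus,

(2.28)    $\inf_{\eta^{m,\varepsilon}} \sup_{F} r(F, \eta^{m,\varepsilon}) \le \lim_{k \to \infty} \left\{ \inf_{\eta^{m,\varepsilon}} \sup_{F^{m,k,\rho}} r(F^{m,k,\rho}, \eta^{m,\varepsilon}) \right\}.$

Since the left hand member of (2.28) cannot be smaller than the right hand member, the equality sign must hold. This concludes the proof of Lemma 2.3.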

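Theorem 2.1 can easily be proved with the help of Lemmas 2.1, 2.2 and 2.3. From Lemma 2.2 it follows that

(2.29)    $\lim_{\varepsilon \to 0} \sup_{\xi} \inf_{\eta^{m,\varepsilon}} r(\xi, \eta^{m,\varepsilon}) = \sup_{\xi} \inf_{\eta^m} r(\xi, \eta^m).$

From this and Lemma 2.3 we obtain

(2.30)    $\lim_{\varepsilon \to 0} \inf_{\eta^{m,\varepsilon}} \sup_{\xi} r(\xi, \eta^{m,\varepsilon}) = \sup_{\xi} \inf_{\eta^m} r(\xi, \eta^m).$

But

$\lim_{\varepsilon \to 0} \inf_{\eta^{m,\varepsilon}} \sup_{\xi} r(\xi, \eta^{m,\varepsilon}) \ge \inf_{\eta^m} \sup_{\xi} r(\xi, \eta^m).$

Hence

(2.31)    $\inf_{\eta^m} \sup_{\xi} r(\xi, \eta^m) \le \sup_{\xi} \inf_{\eta^m} r(\xi, \eta^m).$

Hence, because of Lemma 1.3, we then must have

(2.32)    $\sup_{\xi} \inf_{\eta^m} r(\xi, \eta^m) = \inf_{\eta^m} \sup_{\xi} r(\xi, \eta^m).$

It follows from Lemma 2.1 that

(2.33)    $\lim_{m \to \infty} \sup_{\xi} \inf_{\eta^m} r(\xi, \eta^m) = \sup_{\xi} \inf_{\eta} r(\xi, \eta).$

Hence, because of (2.32), we have

(2.34)    $\lim_{m \to \infty} \inf_{\eta^m} \sup_{\xi} r(\xi, \eta^m) = \sup_{\xi} \inf_{\eta} r(\xi, \eta).$

But

(2.35)    $\lim_{m \to \infty} \inf_{\eta^m} \sup_{\xi} r(\xi, \eta^m) \ge \inf_{\eta} \sup_{\xi} r(\xi, \eta).$

Hence

(2.36)    $\inf_{\eta} \sup_{\xi} r(\xi, \eta) \le \sup_{\xi} \inf_{\eta} r(\xi, \eta).$

Theorem 2.1 is an immediate consequence of (2.36) and Lemma 1.3.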

2.5. Theorems on complete classes of decision functions and minimax solutions. For any positive $\varepsilon$ we shall say that the randomized decision function $\eta_0$ is an $\varepsilon$-Bayes solution relative to the a priori distribution $\xi$ if

(2.37)    $r(\xi, \eta_0) \le \inf_{\eta} r(\xi, \eta) + \varepsilon.$

If $\eta_0$ satisfies (2.37) for $\varepsilon = 0$, we shall say that $\eta_0$ is a Bayes solution relative to $\xi$.

A randomized decision rule $\eta_1$ is said to be uniformly better than $\eta_2$ if

(2.38)    $r(F, \eta_1) \le r(F, \eta_2)$ for all $F$

and if

(2.39)    $r(F, \eta_1) < r(F, \eta_2)$ at least for one $F$.

A class $C$ of randomized decision functions $\eta$ is said to be complete if for any $\eta$ not in $C$ we can find an element $\eta^*$ in $C$ such that $\eta^*$ is uniformly better than $\eta$.
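The three definitions above become concrete when there are only finitely many states of Nature and finitely many decision rules. The sketch below is an illustration under that assumption: each rule is represented by its vector of risks, and the infimum in the definition of an epsilon-Bayes solution is taken over the listed rules only (which loses nothing if the admissible rules are mixtures of them, since the Bayes risk is linear in the mixture). All names and the risk table are hypothetical.

```python
from fractions import Fraction

def uniformly_better(r1, r2):
    """r1 <= r2 for every F, with strict inequality for at least one F."""
    pairs = list(zip(r1, r2))
    return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)

def is_complete(C, risks):
    """True if every rule outside C is uniformly improved upon by some rule in C."""
    return all(
        any(uniformly_better(risks[c], risks[e]) for c in C)
        for e in risks
        if e not in C
    )

def bayes_risk(prior, r):
    """Risk averaged over the a priori distribution."""
    return sum(p * x for p, x in zip(prior, r))

def is_eps_bayes(name, prior, risks, eps):
    """True if the named rule comes within eps of the smallest Bayes risk."""
    best = min(bayes_risk(prior, r) for r in risks.values())
    return bayes_risk(prior, risks[name]) <= best + eps

# Illustrative risk vectors (r(F_1, .), r(F_2, .)) for four rules.
risks = {"a": (1, 3), "b": (2, 2), "c": (2, 4), "d": (3, 3)}
prior = (Fraction(1, 2), Fraction(1, 2))
```

Here the class consisting of "a" and "b" is complete, while "a" alone is not, because "a" is not uniformly better than "b".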





Theorem 2.2. If Conditions 2.1-2.5 are fulfilled, then for any $\varepsilon > 0$ the class $C_\varepsilon$ of all $\varepsilon$-Bayes solutions corresponding to all possible a priori distributions $\xi$ is a complete class.

Proof: Let $\eta_0$ be a randomized decision function that is not an $\varepsilon$-Bayes solution relative to any $\xi$. That is,

(2.40)    $r(\xi, \eta_0) > \inf_{\eta} r(\xi, \eta) + \varepsilon$ for all $\xi$.

If $r(F, \eta_0) = \infty$ for all $F$, then there is evidently an element of $C_\varepsilon$ that is uniformly better than $\eta_0$. Thus, we can restrict ourselves to the case where

(2.41)    $r(F, \eta_0) < \infty$ at least for one $F$.

Put

(2.42)    $W^*(F, d) = W(F, d) - r(F, \eta_0)$

and let $r^*(\xi, \eta)$ denote the risk when $W(F, d)$ is replaced by $W^*(F, d)$. Then

(2.43)    $r^*(\xi, \eta) = r(\xi, \eta) - r(\xi, \eta_0).$

Let $Q^m$ denote the class of all decision functions $d(x)$ for which $n(x) \le m$ identically in $x$. Furthermore, denote any $\eta$ for which $\eta(Q^m) = 1$ by $\eta^m$. We shall first prove the following relation:

(2.44)    $\sup_{\xi} \inf_{\eta^m} r^*(\xi, \eta^m) = \inf_{\eta^m} \sup_{\xi} r^*(\xi, \eta^m)$

for any positive integral value $m$. For any positive constant $c$, let $\Omega_c$ denote the class of all elements $F$ for which $r(F, \eta_0) \le c$.

Clearly, Conditions 2.1-2.5 remain valid if we replace $W(F, d)$ by $W^*(F, d)$ and $\Omega$ by $\Omega_c$, where $c$ is restricted to values for which $\Omega_c$ is not empty. Hence, Theorem 2.1 can be applied and we obtain

(2.45)    $\sup_{\xi^c} \inf_{\eta^m} r^*(\xi^c, \eta^m) = \inf_{\eta^m} \sup_{\xi^c} r^*(\xi^c, \eta^m),$

where $\xi^c$ denotes any $\xi$ for which $\xi(\Omega_c) = 1$. Let $h$ and $u$ be two positive values for which

(2.46)    $\sup_{\xi^c} \inf_{\eta^m} r^*(\xi^c, \eta^m) \ge -h$ for all $c$

and

(2.47)    $r(F, \eta^m) \le u$ for all $F$ and all $\eta^m$.

Clearly, such two constants $h$ and $u$ exist. From (2.46) and Lemma 1.3 we obtain

(2.48)    $\inf_{\eta^m} \sup_{\xi} r^*(\xi, \eta^m) \ge -h.$

Since

(2.49)    $r^*(F, \eta^m) < -(h + \delta)$ for any $F$ not in $\Omega_{h + \delta + u}$ $(\delta > 0)$,

it follows from (2.48) that

(2.50)    $\inf_{\eta^m} \sup_{\xi} r^*(\xi, \eta^m) = \inf_{\eta^m} \sup_{\xi^c} r^*(\xi^c, \eta^m)$ for all $c > h + u$.

From (2.45) and (2.50) we obtain

(2.51)    $\sup_{\xi^c} \inf_{\eta^m} r^*(\xi^c, \eta^m) = \inf_{\eta^m} \sup_{\xi} r^*(\xi, \eta^m)$ for all $c > h + u$.

Hence,

(2.51a)    $\sup_{\xi} \inf_{\eta^m} r^*(\xi, \eta^m) \ge \inf_{\eta^m} \sup_{\xi} r^*(\xi, \eta^m).$

Because of Lemma 1.3, the equality sign must hold and (2.44) is proved.
Since $\eta_0$ is not an element of $C_\varepsilon$, we must have

(2.52)    $\inf_{\eta} r(\xi, \eta) < r(\xi, \eta_0) - \varepsilon.$

From this it follows that

(2.53)    $\inf_{\eta} r^*(\xi, \eta) \le -\varepsilon.$

Hence

(2.54)    $\sup_{\xi} \inf_{\eta} r^*(\xi, \eta) \le -\varepsilon.$

It was shown in the proof of Lemma 2.1 that for any $\rho > 0$ there exists a positive integer $m_\rho$, depending only on $\rho$, such that

(2.55)    $\inf_{\eta^{m_\rho}} r(\xi, \eta^{m_\rho}) \le \inf_{\eta} r(\xi, \eta) + \rho$ for all $\xi$.

From (2.44), (2.54) and (2.55) it follows that there exists a positive integer $m_0$, namely $m_0 = m_{\varepsilon/2}$, such that

(2.56)    $\inf_{\eta^m} \sup_{\xi} r^*(\xi, \eta^m) < -\dfrac{\varepsilon}{2}$ for any $m \ge m_0$.

From (2.44) and (2.56) it follows that there exists an a priori distribution $\xi_1$ and an $\varepsilon$-Bayes solution $\eta_1^m$ relative to $\xi_1$ such that

(2.57)    $r^*(F, \eta_1^m) \le -\dfrac{\varepsilon}{4}$ for all $F$.

Hence, because of (2.43),

(2.58)    $r(F, \eta_1^m) \le r(F, \eta_0) - \dfrac{\varepsilon}{4}$ for all $F$,

and Theorem 2.2 is proved.

Theorem 2.3. If $D$ is compact, and if Conditions 2.1, 2.2, 2.4, 2.5 are fulfilled, then there exists a minimax solution, i.e., a decision rule $\eta_0$ for which

(2.59)    $\sup_{F} r(F, \eta_0) \le \sup_{F} r(F, \eta)$ for all $\eta$.

To prove the above theorem, we shall first prove the following lemma.

Lemma 2.4. If $D$ is compact and if Conditions 2.1, 2.2, 2.4, 2.5 are fulfilled, then for any sequence $\{\eta_i\}$ $(i = 1, 2, \dots,$ ad inf.$)$ of randomized decision functions for which $r(F, \eta_i)$ is a bounded function of $F$ and $i$, there exists a subsequence $\{\eta_{i_j}\}$ $(j = 1, 2, \dots,$ ad inf.$)$ and a randomized decision function $\eta_0$ such that

(2.60)    $\liminf_{j \to \infty} r(\xi, \eta_{i_j}) \ge r(\xi, \eta_0)$ for all $\xi$.

Proof: Let $\zeta_i = \zeta(\eta_i)$ (defined in Section 2.3) be given by $\{z_{r,i}(x^1, \dots, x^r)\}$ and $\{\delta_{x^1 \cdots x^r, i}\}$ $(r = 1, 2, \dots,$ ad inf.$)$. Thus, $z_{r,i}(x^1, \dots, x^r)$ is the conditional probability that we shall take an observation on $X^{r+1}$ using the rule $\eta_i$ and knowing that the first $r$ observations are given by $x^1, \dots, x^r$ and that experimentation was not terminated for $(x^1, \dots, x^k)$ $(k < r)$. As stated in Section 2.3, for any $r, x^1, \dots, x^r$ the symbol $\delta_{x^1 \cdots x^r, i}$ denotes the conditional probability distribution of the selected $d$ when $\eta_i$ is used and it is known that the first $r$ observations are equal to $x^1, \dots, x^r$ and that $n(x) = r$. Since there are only countably many finite samples $(x^1, \dots, x^r)$, it is possible to find a subsequence $\{i_j\}$ of $\{i\}$ such that $\lim_{j \to \infty} z_{r, i_j}(x^1, \dots, x^r)$ and $\lim_{j \to \infty} \delta_{x^1 \cdots x^r, i_j}$ exist.* Put

(2.61)    $\lim_{j \to \infty} z_{r, i_j}(x^1, \dots, x^r) = z_{r,0}(x^1, \dots, x^r)$

and

(2.62)    $\lim_{j \to \infty} \delta_{x^1 \cdots x^r, i_j} = \delta_{x^1 \cdots x^r, 0}.$

As shown in Section 2.3, there exists a randomized decision function $\eta_0$ such that $\zeta_0 = \zeta(\eta_0)$ is given by $\{z_{r,0}(x^1, \dots, x^r)\}$ and $\{\delta_{x^1 \cdots x^r, 0}\}$. Let $q_{r,i}(x^1, \dots, x^r \mid \xi)$ denote the probability that the sample $(x^1, \dots, x^r)$ will be obtained and that experimentation will be stopped at the $r$-th observation when $\xi$ is the a priori distribution and $\eta_i$ is the decision rule used by the statistician. For any sample $(x^1, \dots, x^r)$ let $R_i(x^1, \dots, x^r)$ denote the expected value of $W(F, d)$ when the distribution of $F$ is equal to the a posteriori distribution of $F$ as implied by $\xi$ and $(x^1, \dots, x^r)$ and where $d$ is a chance variable independent of $F$ with the probability distribution $\delta_{x^1 \cdots x^r, i}$. Since $r(\xi, \eta_i)$ is bounded by assumption, the probability that experimentation will go on indefinitely is equal to zero. From this it follows that

(2.63)    $\sum_{r, x^1, \dots, x^r} q_{r,i}(x^1, \dots, x^r \mid \xi) = 1$ for all $\xi$.

* The existence of $\lim_{j \to \infty} \delta_{x^1 \cdots x^r, i_j}$ follows from the compactness of $D$ (see Theorem 3.6 in [3]).



Then $r(\xi, \eta_i)$ is given by

(2.64)    $r(\xi, \eta_i) = \sum_{r, x^1, \dots, x^r} q_{r,i}(x^1, \dots, x^r \mid \xi) \left[ R_i(x^1, \dots, x^r) + \dfrac{\int_{Q_{x^1 \cdots x^r}} c(x^1, \dots, x^r, \mathfrak{D}) \, d\eta_i}{\int_{Q_{x^1 \cdots x^r}} d\eta_i} \right]$

where $Q_{x^1 \cdots x^r}$ is the totality of all decision functions $d(x)$ for which $n(y) = r$ whenever $y^1 = x^1, \dots, y^r = x^r$. Clearly,

(2.65)    $\lim_{j \to \infty} q_{r, i_j}(x^1, \dots, x^r \mid \xi) = q_{r,0}(x^1, \dots, x^r \mid \xi).$

Since $D$ is compact and since $W(F, d)$ is a continuous function of $d$ uniformly in $F$ (in the sense of the metric defined in $D$), we have

(2.66)    $\lim_{j \to \infty} R_{i_j}(x^1, \dots, x^r) = R_0(x^1, \dots, x^r).$

From Condition 2.5 it follows that

(2.67)    $\lim_{j \to \infty} \dfrac{\int_{Q_{x^1 \cdots x^r}} c(x^1, \dots, x^r, \mathfrak{D}) \, d\eta_{i_j}}{\int_{Q_{x^1 \cdots x^r}} d\eta_{i_j}} = \dfrac{\int_{Q_{x^1 \cdots x^r}} c(x^1, \dots, x^r, \mathfrak{D}) \, d\eta_0}{\int_{Q_{x^1 \cdots x^r}} d\eta_0}.$

Lemma 2.4 is an immediate consequence of the equations (2.64)-(2.67).

We are now in a position to prove Theorem 2.3. Because of Theorem 2.1 there exists a sequence $\{\eta_i\}$ such that

(2.68)    $\lim_{i \to \infty} \sup_{F} r(F, \eta_i) = \inf_{\eta} \sup_{F} r(F, \eta).$

According to Lemma 2.4 there exists a subsequence $\{\eta_{i_j}\}$ $(j = 1, 2, \dots,$ ad inf.$)$ and a randomized decision function $\eta_0$ such that

(2.69)    $\liminf_{j \to \infty} r(F, \eta_{i_j}) \ge r(F, \eta_0)$ for all $F$.

It follows from (2.68) and (2.69) that $\eta_0$ is a minimax solution and Theorem 2.3 is proved.

Theorem 2.4. If $D$ is compact and if Conditions 2.1, 2.2, 2.4, 2.5 are fulfilled, then for any $\xi$ there exists a Bayes solution relative to $\xi$.

This theorem is an immediate consequence of Lemma 2.4.

We shall say that $\eta_0$ is a Bayes solution in the wide sense, if there exists a sequence $\{\xi_i\}$ $(i = 1, 2, \dots,$ ad inf.$)$ such that

(2.70)    $\lim_{i \to \infty} \left[ r(\xi_i, \eta_0) - \inf_{\eta} r(\xi_i, \eta) \right] = 0.$

We shall say that $\eta_0$ is a Bayes solution in the strict sense, if there exists a $\xi$ such that $\eta_0$ is a Bayes solution relative to $\xi$.





Theorem 2.5. If $D$ is compact and Conditions 2.1-2.5 hold, then the class of all Bayes solutions in the wide sense is a complete class.

Proof: Let $\eta_0$ be a decision rule that is not a Bayes solution in the wide sense. Consider the weight function $W^*(F, d) = W(F, d) - r(F, \eta_0)$. We may assume that $r(F, \eta_0) < \infty$ for at least some $F$, since otherwise there obviously exists a Bayes solution in the wide sense that is uniformly better than $\eta_0$. Then it follows easily from (2.44) and Lemmas 2.1 and 1.3 that

(2.71)    $\sup_{\xi} \inf_{\eta} r^*(\xi, \eta) = \inf_{\eta} \sup_{\xi} r^*(\xi, \eta) = v^*$ (say),

where $r^*(\xi, \eta)$ is the risk corresponding to $W^*(F, d)$, i.e.,

(2.72)    $r^*(\xi, \eta) = r(\xi, \eta) - r(\xi, \eta_0).$

Theorem 2.3 is clearly applicable to the risk function $r^*(\xi, \eta)$. Then, there exists a minimax solution $\eta_1$ for the problem corresponding to the new weight function $W^*(F, d)$. Since, because of (2.72), $v^* \le 0$, we have

(2.73)    $r^*(\xi, \eta_1) = r(\xi, \eta_1) - r(\xi, \eta_0) \le 0$ for all $\xi$.

Our theorem is proved, if we can show that $\eta_1$ is a Bayes solution in the wide sense. Let $\{\xi_i\}$ $(i = 1, 2, \dots,$ ad inf.$)$ be a sequence of a priori distributions such that

(2.74)    $\lim_{i \to \infty} \inf_{\eta} r^*(\xi_i, \eta) = v^*.$

Since $\eta_1$ is a minimax solution, we must have

(2.75)    $r^*(\xi_i, \eta_1) \le v^*.$

It follows from (2.74) and (2.75) that $\eta_1$ is a Bayes solution in the wide sense and our theorem is proved.

We shall now formulate an additional condition which will permit the derivation of some stronger theorems. First, we shall give a convergence definition in the space $\Omega$. We shall say that $F_i$ converges to $F$ in the ordinary sense if

(2.76)    $\lim_{i \to \infty} p_r(x^1, \dots, x^r \mid F_i) = p_r(x^1, \dots, x^r \mid F)$ $(r = 1, 2, \dots,$ ad inf.$)$.

Here $p_r(x^1, \dots, x^r \mid F)$ denotes the probability, under $F$, that the first $r$ observations will be equal to $x^1, \dots, x^r$, respectively. We shall say that a subset $\omega$ of $\Omega$ is compact in the ordinary sense, if $\omega$ is compact in the sense of the convergence definition (2.76).

Condition 2.6. The space $\Omega$ is compact in the ordinary sense. If $F_i$ converges to $F$, as $i \to \infty$, in the ordinary sense, then

$\lim_{i \to \infty} W(F_i, d) = W(F, d)$

uniformly in $d$.

Theorem 2.6. If $D$ is compact and if Conditions 2.1, 2.2, 2.4, 2.5, 2.6 hold, then:

(i) There exists a least favorable a priori distribution, i.e., an a priori distribution $\xi_0$ for which

$\inf_{\eta} r(\xi_0, \eta) = \sup_{\xi} \inf_{\eta} r(\xi, \eta).$

(ii) A minimax solution exists and any minimax solution is a Bayes solution in the strict sense.

(iii) If $\eta_0$ is a decision rule which is not a Bayes solution in the strict sense and for which $r(F, \eta_0)$ is a bounded function of $F$, then there exists a decision rule $\eta_1$ which is a Bayes solution in the strict sense and is uniformly better than $\eta_0$.

Proof: Let $\{\xi_i\}$ $(i = 1, 2, \dots,$ ad inf.$)$ be a sequence of a priori distributions such that

(2.77)    $\lim_{i \to \infty} \inf_{\eta} r(\xi_i, \eta) = \sup_{\xi} \inf_{\eta} r(\xi, \eta).$

Since $\Omega$ is compact in the ordinary sense, there exists an a priori distribution $\xi_0$ and a subsequence $\{\xi_{i_j}\}$ of $\{\xi_i\}$ such that

(2.78)    $\lim_{j \to \infty} \xi_{i_j}(\omega) = \xi_0(\omega)$

for any subset $\omega$ of $\Omega$ which is open (in the sense of the ordinary convergence definition in $\Omega$) and for which $\xi_0(\omega^*) = 0$, where $\omega^*$ denotes the set of all boundary points of $\omega$. We shall show that $\xi_0$ is a least favorable distribution. Assume that it is not. Then there exists a decision function $\mathfrak{D}_0 = d_0(x)$ such that

(2.79)    $r(\xi_0, \mathfrak{D}_0) \le v - \delta,$

where $\delta > 0$ and $v$ denotes the common value of $\sup_{\xi} \inf_{\eta} r$ and $\inf_{\eta} \sup_{\xi} r$. It was shown in the proof of Lemma 2.1 that (2.79) implies the existence of a decision function $\mathfrak{D}_1 = d_1(x)$ and that of a positive integer $m$ such that

(2.80)    $n(x; \mathfrak{D}_1) \le m$ for all $x$

and

(2.81)    $r(\xi_0, \mathfrak{D}_1) \le v - \dfrac{\delta}{2}.$

Since $c(x^1, \dots, x^m, \mathfrak{D}_1)$ and $W(F, d)$ are uniformly bounded and $W(F, d)$ is continuous in $F$ uniformly in $d$, we have

(2.82)    $\lim_{i \to \infty} r(F_i, \mathfrak{D}_1) = r(F, \mathfrak{D}_1)$

for any sequence $\{F_i\}$ for which $F_i \to F$ in the ordinary sense. From (2.78), (2.82) and the compactness of $\Omega$ (in the ordinary sense) it follows that

(2.83)    $\lim_{j \to \infty} r(\xi_{i_j}, \mathfrak{D}_1) = r(\xi_0, \mathfrak{D}_1) \le v - \dfrac{\delta}{2}.$

But this is in contradiction to (2.77) and, therefore, $\xi_0$ must be a least favorable distribution. Hence, statement (i) of our theorem is proved.

Statement (ii) is an immediate consequence of Theorems 2.1, 2.3 and statement (i) of Theorem 2.6.

To prove (iii), replace the weight function $W(F, d)$ by $W^*(F, d) = W(F, d) - r(F, \eta_0)$ where $\eta_0$ satisfies the conditions imposed on it in (iii).

We shall show that (i) remains valid also when $W(F, d)$ is replaced by $W^*(F, d)$. This is not clear, since $W^*(F, d)$ may not be continuous in $F$. First we shall prove that

(2.84)    $\liminf_{i \to \infty} r(\xi_i, \eta_0) \ge r(\xi_0, \eta_0)$

for any sequence $\{\xi_i\}$ for which $\xi_i \to \xi_0$ in the ordinary sense, i.e., for which

(2.85)    $\lim_{i \to \infty} \xi_i(\omega) = \xi_0(\omega)$

for any open subset $\omega$ (open in the sense of ordinary convergence defined in $\Omega$) whose boundary has probability measure zero according to $\xi_0$. For any sample $x^1, \dots, x^r$, let $q_{ri}(x^1, \dots, x^r)$ denote the probability that the first $r$ observations will be equal to $x^1, \dots, x^r$, respectively, when $\xi_i$ is the a priori distribution. Clearly,

(2.86)    $q_{ri}(x^1, \dots, x^r) = \int_{\Omega} p_r(x^1, \dots, x^r \mid F) \, d\xi_i.$

Since $p_r(x^1, \dots, x^r \mid F)$ is a continuous function of $F$, we have

(2.87)    $\lim_{i \to \infty} q_{ri}(x^1, \dots, x^r) = q_{r0}(x^1, \dots, x^r).$

The function $r(\xi, \eta_0)$ can be split into two parts, i.e., $r(\xi, \eta_0) = r_1(\xi, \eta_0) + r_2(\xi, \eta_0)$ where $r_1$ is the expected value of the loss $W(F, d)$ and $r_2$ is the expected cost of experimentation. Since $W(F, d)$ is a bounded function of $F$ and $d$, and since $W(F, d)$ is continuous in $F$ uniformly in $d$, we have

(2.88)    $\lim_{i \to \infty} r_1(\xi_i, \eta_0) = r_1(\xi_0, \eta_0)$

for any sequence $\{\xi_i\}$ which satisfies (2.85). To prove (2.84), we merely have to show that

(2.89)    $\liminf_{i \to \infty} r_2(\xi_i, \eta_0) \ge r_2(\xi_0, \eta_0).$

But

(2.90)    $r_2(\xi_i, \eta_0) = \sum_{r, x^1, \dots, x^r} q_{ri}(x^1, \dots, x^r) \int_{Q_{x^1 \cdots x^r}} c(x^1, \dots, x^r, \mathfrak{D}) \, d\eta_0$

where $Q_{x^1 \cdots x^r}$ is the totality of all decision functions $d(x)$ with the property that $n(y) = r$ for any $y$ whose first $r$ coordinates are equal to $x^1, \dots, x^r$, respectively. Equation (2.89) is an immediate consequence of (2.87) and (2.90). Hence, (2.84) is proved.





Let $r^*(\xi, \eta)$ be the risk function when $W(F, d)$ is replaced by $W^*(F, d)$, i.e., $r^*(\xi, \eta) = r(\xi, \eta) - r(\xi, \eta_0)$. Let, furthermore, $\{\xi_i^*\}$ be a sequence of a priori distributions such that

(2.91)    $\lim_{i \to \infty} \inf_{\eta} r^*(\xi_i^*, \eta) = \sup_{\xi} \inf_{\eta} r^*(\xi, \eta).$

There exists a subsequence $\{\xi_{i_j}^*\}$ of the sequence $\{\xi_i^*\}$ such that $\xi_{i_j}^*$ converges (in the ordinary sense) to a limit distribution $\xi_0^*$ as $j \to \infty$. We shall show that $\xi_0^*$ is a least favorable distribution. For suppose that $\xi_0^*$ is not a least favorable distribution. Then there exists a decision function $\mathfrak{D}_0^* = d_0^*(x)$ such that

(2.92)    $r^*(\xi_0^*, \mathfrak{D}_0^*) \le v^* - \delta$

where $\delta > 0$ and $v^* = \sup \inf r^* = \inf \sup r^*$. But then there exists a decision function $\mathfrak{D}_1^* = d_1^*(x)$ and a positive integer $m$ such that

(2.93)    $n(x; \mathfrak{D}_1^*) \le m$ for all $x$

and

(2.94)    $r^*(\xi_0^*, \mathfrak{D}_1^*) \le v^* - \dfrac{\delta}{2}.$

Since $r^*(\xi, \mathfrak{D}_1^*) = r(\xi, \mathfrak{D}_1^*) - r(\xi, \eta_0)$, and since

$\lim_{j \to \infty} r(\xi_{i_j}^*, \mathfrak{D}_1^*) = r(\xi_0^*, \mathfrak{D}_1^*),$

it follows from (2.84) and (2.94) that

(2.95)    $\limsup_{j \to \infty} r^*(\xi_{i_j}^*, \mathfrak{D}_1^*) \le v^* - \dfrac{\delta}{2}$

which is in contradiction to (2.91). Hence, the validity of (i) is proved also when $W(F, d)$ is replaced by $W^*(F, d)$. Clearly, also (ii) remains valid when $W(F, d)$ is replaced by $W^*(F, d)$.

Let $\eta_1$ be a minimax solution relative to the problem corresponding to $W^*(F, d)$. Then because of (ii), $\eta_1$ is a Bayes solution in the strict sense. Since $\eta_0$ is not a Bayes solution in the strict sense, $\eta_1 \ne \eta_0$ and $v^* < 0$. Hence $\eta_1$ is uniformly better than $\eta_0$. This completes the proof of Theorem 2.6.

We shall now replace Condition 2.6 by the following weaker one.

Condition 2.6*. There exists a sequence $\{\Omega_i\}$ $(i = 1, 2, \dots,$ ad inf.$)$ of subsets of $\Omega$ such that Condition 2.6 is fulfilled when $\Omega$ is replaced by $\Omega_i$, $\Omega_{i+1} \supset \Omega_i$ and $\lim_{i \to \infty} \Omega_i = \Omega$.

We shall say that $\eta_i$ converges weakly to $\eta$ as $i \to \infty$, if $\lim_{i \to \infty} \zeta(\eta_i) = \zeta(\eta)$. We shall also say that $\eta$ is a weak limit of $\eta_i$. This limit definition seems to be natural, since $r(\xi, \eta_1) = r(\xi, \eta_2)$ if $\zeta(\eta_1) = \zeta(\eta_2)$. We shall now prove the following theorem:





Theorem 2.7. If $D$ is compact and if Conditions 2.1, 2.2, 2.4, 2.5 and 2.6* are fulfilled, then:

(i) A minimax solution exists that is a weak limit of a sequence of Bayes solutions in the strict sense.

(ii) Let $\eta_0$ be a decision rule for which $r(F, \eta_0)$ is a bounded function of $F$. Then there exists a decision rule $\eta_1$ that is a weak limit of a sequence of Bayes solutions in the strict sense and such that $r(F, \eta_1) \le r(F, \eta_0)$ for all $F$ in $\Omega$.

Proof: According to Theorem 2.6, there exists a decision rule $\eta_i$ that is a Bayes solution in the strict sense and a minimax solution if $\Omega$ is replaced by $\Omega_i$. There exists a subsequence $\{\eta_{i_j}\}$ $(j = 1, 2, \dots,$ ad inf.$)$ of the sequence $\{\eta_i\}$ such that $\{\eta_{i_j}\}$ admits a weak limit. Let $\eta_0$ be a weak limit of $\{\eta_{i_j}\}$. Then, as shown in the proof of Lemma 2.4, equation (2.60) holds and $\eta_0$ is a minimax solution relative to the original space $\Omega$. Thus, statement (i) is proved.

To prove (ii), replace $W(F, d)$ by $W^*(F, d) = W(F, d) - r(F, \eta_0)$. According to Theorem 2.6 there exists a decision rule $\eta_{1i}$ such that $\eta_{1i}$ is a minimax solution and a Bayes solution in the strict sense when $\Omega$ is replaced by $\Omega_i$ and $W(F, d)$ by $W^*(F, d)$. Clearly, $\eta_{1i}$ remains a Bayes solution in the strict sense also relative to $\Omega_i$ and $W(F, d)$. Since $\eta_{1i}$ is a minimax solution relative to $\Omega_i$ and $W^*(F, d)$, we have

(2.96)    $r(F, \eta_{1i}) \le r(F, \eta_0)$ for all $F$ in $\Omega_i$.

Let $\{\eta_{1 i_j}\}$ be a subsequence of the sequence $\{\eta_{1i}\}$ such that $\{\eta_{1 i_j}\}$ admits a weak limit $\eta_1$. Then, (2.60) holds for $\{\eta_{1 i_j}\}$ and $\eta_1$, and

(2.97)    $r(F, \eta_1) \le r(F, \eta_0)$ for all $F$ in $\Omega$.

Since $\eta_1$ is a weak limit of strict Bayes solutions, statement (ii) is proved.

3. Statistical decision functions: the case of continuous chance variables. 

3.1. Introductory remarks. In this section we shall be concerned with the
case where the probability distribution F of X is absolutely continuous, i.e.,
for any element F of $\Omega$ and for any positive integer r there exists a joint density
function $p_r(x^1, \ldots, x^r \mid F)$ of the first r chance variables $X^1, \ldots, X^r$.

The continuous case can immediately be reduced to the discrete case discussed
in section 2 if the observations are not given exactly but only up to a finite number
of decimal places. More precisely, we mean this: For each i, let the real
axis R be subdivided into a denumerable number of disjoint sets $R_{i1}, R_{i2}, \ldots$,
ad inf. Suppose that the observed value $x^i$ of $X^i$ is not given exactly; it is merely
known which element of the sequence $\{R_{ij}\}$ (j = 1, 2, ..., ad inf.) contains
$x^i$. This is the situation, for example, if the value of $x^i$ is given merely up to a
finite number, say r, of decimal places (r fixed, independent of i). This case can
be reduced to the previously discussed discrete case, since we can regard the
sets $R_{ij}$ as our points, i.e., we can replace the chance variable $X^i$ by $Y^i$ where
$Y^i$ can take only the values $R_{i1}, R_{i2}, \ldots$, ad inf. ($Y^i$ takes the value $R_{ij}$ if
$x^i$ falls in $R_{ij}$). If $W(F_1, d) = W(F_2, d)$ whenever the distribution of Y under
$F_1$ is identical with that under $F_2$, only the chance variables $Y^1, Y^2, \ldots$, etc.
play a role in the decision problem and we have the discrete case. If the latter
condition on the weight function is not fulfilled, i.e., if there exists a pair $(F_1, F_2)$
such that $W(F_1, d) \neq W(F_2, d)$ for some d and the distribution of Y is the same
under $F_1$ as under $F_2$, we can still reduce the problem to the discrete case, if in
the discrete case we permit the weight W to depend also on a third extraneous
variable G, i.e., if we put W = W(F, G, d), where G is a variable about whose
value the sample does not give any information. The results obtained in the
discrete case can easily be generalized to include the situation where W =
W(F, G, d).
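The reduction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cells are taken to be the half-open intervals $[j/10^r, (j+1)/10^r)$, and the names `cell_index` and `discretize` are hypothetical, not part of the paper.

```python
import math

def cell_index(x: float, decimals: int) -> int:
    """Return the index j of the cell [j/10^r, (j+1)/10^r) containing x.

    The cell plays the role of the value taken by Y^i; r = decimals is
    fixed and does not depend on i (assumption of this sketch).
    """
    scale = 10 ** decimals
    return math.floor(x * scale)

def discretize(sample, decimals):
    """Replace each observed x^i by the index of the cell it falls in."""
    return [cell_index(x, decimals) for x in sample]

# Two observations in the same cell become indistinguishable.
observed = [3.14159, 3.149, 2.718]
cells = discretize(observed, 2)
```

Any decision rule that depends on the sample only through `cells` is a rule for the discrete problem in the chance variables $Y^i$.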

In practical applications the observed value $x^i$ of $X^i$ will usually be given
only up to a certain number of decimal places and, thus, the problem can be
reduced to the discrete case. Nevertheless, it seems desirable from the theoretical
point of view to develop the theory of the continuous case, assuming
that the observed value $x^i$ of $X^i$ is given precisely.

In section 2.3 an alternative definition of a randomized decision rule was given
in terms of two sequences of functions $\{z_r(x^1, \ldots, x^r)\}$ and $\{\delta_{x^1 \ldots x^r}(D^*)\}$
(r = 1, 2, ..., ad inf.). We used the symbol $\zeta$ to denote a randomized decision rule given
by two such sequences. It was shown in the discrete case that the use of a
randomized decision function $\eta$ generates a certain $\zeta = \zeta(\eta)$, and that for any
given $\zeta$ there exists an $\eta$ such that $\zeta = \zeta(\eta)$. Furthermore, because of Condition
2.5, in the discrete case we had $r(F, \eta_1) = r(F, \eta_2)$ if $\zeta(\eta_1) = \zeta(\eta_2)$. It would be
possible to develop a similar theory as to the relation between $\zeta$ and $\eta$ also in the
continuous case. However, a somewhat different procedure will be followed for
the sake of simplicity. Instead of the decision functions d(x), we shall regard
the $\zeta$'s as the pure strategies of the statistician, i.e., we replace the space Q of
all decision functions d(x) by the space Z of all randomized decision rules $\zeta$.
It will then be necessary to consider probability measures $\eta$ defined over an
additive class of subsets of Z. It will be sufficient, as will be seen later, to consider
only discrete probability measures $\eta$. A probability measure $\eta$ is said to be
discrete, if it assigns the probability 1 to some denumerable subset of Z. Any
discrete $\eta$ will clearly generate a certain $\zeta = \zeta(\eta)$. In the next section we shall
formulate some conditions which will imply that $r(F, \eta_1) = r(F, \eta_2)$ if $\zeta(\eta_1) =
\zeta(\eta_2)$. Thus, it will be possible to restrict ourselves to consideration of pure
strategies $\zeta$, which will cause considerable simplifications.

The definitions of various notions given in the discrete case, such as minimax
solution, Bayes solution, a priori distribution $\xi$ in $\Omega$, least favorable a priori
distribution, complete class of decision functions, etc. can immediately be extended
to the continuous case and will, therefore, not be restated here.

3.2. Conditions on $\Omega$, D, W(F, d) and the cost function. In this section we shall
formulate conditions similar to those given in the discrete case.

Condition 3.1. Each element F of $\Omega$ is absolutely continuous.

Condition 3.2. W(F, d) is a bounded function of F and d.

Condition 3.3. The space D is compact in the sense of its intrinsic metric
$\delta(d_1, d_2; \Omega)$ (see equation 2.2).





This condition is somewhat stronger than the corresponding Condition 2.3. 
While it may be possible to weaken this condition, it would make the proofs of 
certain theorems considerably more involved. 

Condition 3.4. The cost of experimentation $c(x^1, \ldots, x^m)$ does not depend on
$\zeta$. It is non-negative and $\lim_{m \to \infty} c(x^1, \ldots, x^m) = \infty$ uniformly in $x^1, \ldots, x^m$. For
each positive integral value m, $c(x^1, \ldots, x^m)$ is a bounded function of $x^1, \ldots, x^m$.

This condition is stronger than Conditions 2.4 and 2.5 postulated in the discrete
case. The reason for formulating a stronger condition here is that we wish
the relation $r(F, \eta_1) = r(F, \eta_2)$ to be fulfilled whenever $\zeta(\eta_1) = \zeta(\eta_2)$, which will
make it possible for us to eliminate the consideration of $\eta$'s altogether. Since
the $\zeta$'s are regarded here as the pure strategies of the statistician, it is not clear
what kind of dependence of the cost on $\zeta$ would be consistent with the requirement
that $r(F, \eta_1) = r(F, \eta_2)$ whenever $\zeta(\eta_1) = \zeta(\eta_2)$.

We shall say that $F_i \to F$ in the ordinary sense, if for any positive integral
value m

$\lim_{i \to \infty} \int_{S_m} p_m(x^1, \ldots, x^m \mid F_i)\, dx^1 \cdots dx^m = \int_{S_m} p_m(x^1, \ldots, x^m \mid F)\, dx^1 \cdots dx^m$

uniformly in $S_m$ where $S_m$ is a subset of the m-dimensional sample space.

Condition 3.5. The space $\Omega$ is separable in the sense of the above convergence
definition.*

No such condition was formulated in the discrete case for the simple reason
that in the discrete case $\Omega$ is always separable in the sense of the convergence
definition given in (2.76).

3.3. Some lemmas. We shall first give a convergence definition in the space
Z of all $\zeta$'s which is somewhat different from the one given in the discrete case.
Let $h_r(x^1, x^2, \ldots, x^r, D^*)$ denote the probability that experimentation will be
terminated with the rth observation and that the final decision d selected will
be an element of D*, knowing that the first r observations are equal to $x^1, \ldots, x^r$,
respectively. That is,

(3.1) $h_r(x^1, \ldots, x^r, D^*) = z_1(x^1)\, z_2(x^1, x^2) \cdots z_{r-1}(x^1, \ldots, x^{r-1})\, [1 - z_r(x^1, \ldots, x^r)]\, \delta_{x^1 \ldots x^r}(D^*)$.

Clearly, the functions $h_r(x^1, \ldots, x^r, D^*)$ are non-negative and satisfy the following
conditions:

(3.2) $\sum_{r=1}^{m} h_r(x^1, \ldots, x^r, D^*) \le 1$ for any D* and for any sample $(x^1, \ldots, x^m)$.

(3.3) $\sum_{j=1}^{\infty} h_r(x^1, \ldots, x^r, D_j^*) = h_r(x^1, \ldots, x^r, D^*)$

if $\sum_{j=1}^{\infty} D_j^* = D^*$ and $D_1^*, D_2^*, \ldots$, etc. are disjoint.

* For a definition of a separable space, see F. Hausdorff, Mengenlehre (3rd edition), p. 125.





One can easily verify that for any sequence of non-negative functions
$\{h_r(x^1, \ldots, x^r, D^*)\}$ (r = 1, 2, ...) satisfying (3.2) and (3.3) there exists exactly
one sequence $\{z_r(x^1, \ldots, x^r)\}$ and one sequence $\{\delta_{x^1 \ldots x^r}(D^*)\}$ such that (3.1)
is fulfilled. Thus, a randomized decision rule $\zeta$ can be given by a sequence
$\{h_r(x^1, \ldots, x^r, D^*)\}$ satisfying (3.2) and (3.3). The functions $z_r(x^1, \ldots, x^r)$ and
$\delta_{x^1 \ldots x^r}$ need be defined only for samples $x^1, \ldots, x^r$ for which $z_i(x^1, \ldots, x^i) > 0$
for i = 1, ..., r - 1. The above mentioned uniqueness of $z_r(x^1, \ldots, x^r)$
and $\delta_{x^1 \ldots x^r}$ was meant to hold if the definition of these functions is restricted
to such samples $x^1, \ldots, x^r$.
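The correspondence between the pair $(z_r, \delta)$ and the functions $h_r$ can be checked numerically along one fixed sample. The sketch below assumes a finite decision space, reads $z_r$ as the probability of continuing after the rth observation (as the form of (3.1) suggests), and uses exact fractions; the function names are hypothetical.

```python
from fractions import Fraction

def h_from_rule(z, delta):
    """Build h_r(d) = z_1 ... z_{r-1} (1 - z_r) delta_r(d) along one fixed sample.

    z[r-1] is the probability of continuing after the rth observation and
    delta[r-1] maps each decision d to its probability if we stop there.
    """
    h, survive = [], 1
    for z_r, delta_r in zip(z, delta):
        stop = survive * (1 - z_r)          # probability of stopping exactly at stage r
        h.append({d: stop * p for d, p in delta_r.items()})
        survive = survive * z_r             # probability of going on past stage r
    return h

def rule_from_h(h):
    """Recover (z, delta) from h; this is the uniqueness statement in the text.

    Simplification of this sketch: stages with zero stopping probability, or
    stages that are never reached, leave the rule undetermined and raise.
    """
    z, delta, survive = [], [], 1
    for h_r in h:
        stop = sum(h_r.values())            # h_r(x^1, ..., x^r, D)
        if survive == 0 or stop == 0:
            raise ValueError("z_r and delta_r are not determined at this stage")
        z.append(1 - stop / survive)
        delta.append({d: v / stop for d, v in h_r.items()})
        survive = survive - stop
    return z, delta

z = [Fraction(1, 2), Fraction(1, 4), Fraction(0)]
delta = [
    {"a": Fraction(1, 3), "b": Fraction(2, 3)},
    {"a": Fraction(1), "b": Fraction(0)},
    {"a": Fraction(1, 2), "b": Fraction(1, 2)},
]
h = h_from_rule(z, delta)
```

The total stopping probability accumulated in `h` cannot exceed 1, which is condition (3.2) for D* = D, and `rule_from_h(h)` returns the original `z` and `delta`.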

For any bounded subset $S_r$ of the r-dimensional sample space, let

(3.4) $H_r(S_r, D^*) = \int_{S_r} h_r(x^1, \ldots, x^r, D^*)\, dx^1 \cdots dx^r$.

Let $\{\zeta_i\}$ (i = 0, 1, 2, ..., ad inf.) be a sequence of decision rules,
and let $H_{r,i}(S_r, D^*)$ be the function $H_r(S_r, D^*)$ corresponding to $\zeta_i$. We shall
say that

(3.5) $\lim_{i \to \infty} \zeta_i = \zeta_0$

if

(3.6) $\lim_{i \to \infty} H_{r,i}(S_r, D^*) = H_{r,0}(S_r, D^*)$

for any r, any bounded set $S_r$ and for any D* that is an element of a sequence
$\{D^*_{k_1 \ldots k_l}\}$ ($k_i = 1, \ldots, r_i$; i = 1, ..., l; l = 1, 2, ..., ad inf.) of subsets
of D satisfying the following conditions:

(3.7) $\sum_{k_1} D^*_{k_1} = D$; $\sum_{k_l = 1}^{r_l} D^*_{k_1 \ldots k_l} = D^*_{k_1 \ldots k_{l-1}}$,

(3.8) $D^*_{k_1 \ldots k_{l-1} 1}, \ldots, D^*_{k_1 \ldots k_{l-1} r_l}$ are disjoint,

and

(3.9) the diameter of $D^*_{k_1 \ldots k_l}$ converges to zero as $l \to \infty$ uniformly in $k_1, \ldots, k_l$.
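A concrete family of subsets satisfying (3.7)-(3.9) may help fix ideas. The sketch below is an assumption-laden illustration, not taken from the paper: D is the unit interval with the usual distance, every $r_i$ equals 2, the cells are half-open dyadic intervals (the end point 1 is ignored for simplicity), and `cell` and `children` are hypothetical names.

```python
from fractions import Fraction

def cell(keys):
    """Half-open interval [lo, hi) indexed by keys (k_1, ..., k_l), each k in {1, 2}."""
    lo, width = Fraction(0), Fraction(1)
    for k in keys:
        width /= 2                  # each level halves the diameter, as (3.9) requires
        lo += (k - 1) * width       # k = 1 keeps the left half, k = 2 takes the right half
    return lo, lo + width

def children(keys):
    """The two cells of the next level contained in the cell with index `keys`."""
    return [cell(keys + (k,)) for k in (1, 2)]

parent = cell((1, 2))
left, right = children((1, 2))
```

Here `left` and `right` are disjoint and their union is `parent`, which mirrors (3.7) and (3.8), and a cell of level l has diameter $2^{-l}$.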

Lemma 3.1. For any sequence $\{\zeta_i\}$ (i = 1, 2, ..., ad inf.) of decision rules
there exists a subsequence $\{\zeta_{i_j}\}$ (j = 1, 2, ..., ad inf.) and a decision rule $\zeta_0$ such
that $\lim_{j \to \infty} \zeta_{i_j} = \zeta_0$.

Proof: Let $H_{r,i}(S_r, D^*)$ (r = 1, 2, ..., ad inf.) be the sequence of functions
associated with $\zeta_i$. Let, furthermore, $\{D^*_{k_1 \ldots k_l}\}$ be a sequence of subsets of D
satisfying the relations (3.7), (3.8) and (3.9). Clearly, for any fixed r and any
fixed element $D^*_{k_1 \ldots k_l}$ of the sequence $\{D^*_{k_1 \ldots k_l}\}$, it is possible to find a subsequence
$\{i_j\}$ (j = 1, 2, ..., ad inf.) of the sequence $\{i\}$ (the subsequence $\{i_j\}$
may depend on r and $D^*_{k_1 \ldots k_l}$) and a set function $H_{r,0}(S_r)$ such that

(3.10) $\lim_{j \to \infty} H_{r,i_j}(S_r, D^*_{k_1 \ldots k_l}) = H_{r,0}(S_r)$.

Using the well known diagonal procedure, it is therefore possible to find a fixed
subsequence $\{i_j\}$ (independent of r and D*) and a sequence of set functions
$\{H_{r,0}(S_r, D^*_{k_1 \ldots k_l})\}$ such that

(3.11) $\lim_{j \to \infty} H_{r,i_j}(S_r, D^*_{k_1 \ldots k_l}) = H_{r,0}(S_r, D^*_{k_1 \ldots k_l})$

for all values of r, $k_1, \ldots, k_l$ and l.

To complete the proof of Lemma 3.1, it remains to be shown that
there exists a decision rule $\zeta_0$ such that the associated function $H_r(S_r, D^*)$ is
equal to $H_{r,0}(S_r, D^*)$ for any D* that is an element of $\{D^*_{k_1 \ldots k_l}\}$. Since
$h_{r,i}(x^1, \ldots, x^r, D^*)$ is uniformly bounded, the set function $H_{r,0}(S_r, D^*_{k_1 \ldots k_l})$ is
absolutely continuous. Hence for any values of $k_1, \ldots, k_l$ there exists a function
$h_{r,0}(x^1, \ldots, x^r, D^*_{k_1 \ldots k_l})$ such that

(3.12) $\int_{S_r} h_{r,0}(x^1, \ldots, x^r, D^*_{k_1 \ldots k_l})\, dx^1 \cdots dx^r = H_{r,0}(S_r, D^*_{k_1 \ldots k_l})$.

The existence of a $\zeta_0$ with the desired property is proved, if we show that the
functions $h_{r,0}(x^1, \ldots, x^r, D^*_{k_1 \ldots k_l})$ satisfy the relations (3.2) and (3.3). Let
$h_r(x^1, \ldots, x^m, D^*) = h_r(x^1, \ldots, x^r, D^*)$ for any m > r. Then, since the functions
$h_{r,i}$ satisfy (3.2), we have

(3.13) $\sum_{r=1}^{m} H_{r,i}(S_m, D^*) \le V(S_m)$

where $V(S_m)$ denotes the m-dimensional Lebesgue measure of $S_m$. From (3.13)
it follows that

(3.14) $\sum_{r=1}^{m} H_{r,0}(S_m, D^*_{k_1 \ldots k_l}) \le V(S_m)$.

Hence, the functions $h_{r,0}(x^1, \ldots, x^r, D^*_{k_1 \ldots k_l})$ must satisfy (3.2) except perhaps
on a set of Lebesgue measure zero. Since the functions $h_{r,i}(x^1, \ldots, x^r, D^*)$ satisfy
(3.3), we must have

(3.15) $H_{r,i}(S_r, D^*_{k_1 \ldots k_{l-1}}) = \sum_{k_l = 1}^{r_l} H_{r,i}(S_r, D^*_{k_1 \ldots k_l})$.

Hence, the same relation must hold also for $H_{r,0}(S_r, D^*_{k_1 \ldots k_l})$. But this implies
that the functions $h_{r,0}(x^1, \ldots, x^r, D^*_{k_1 \ldots k_l})$ satisfy (3.3) except perhaps on a set
of Lebesgue measure zero, and the proof of Lemma 3.1 is completed.

Lemma 3.2. Let $T_i(S)$ (i = 0, 1, 2, ...) be a non-negative, completely additive
set function defined for all measurable subsets S of the r-dimensional sample space
$M_r$. Assume that

(3.16) $T_i(S) \le V(S)$

for all S (i = 0, 1, 2, ..., ad inf.) where V(S) denotes the Lebesgue measure of S.
Let, furthermore, $g(x^1, \ldots, x^r)$ be a non-negative function such that

(3.17) $\int_{M_r} g(x^1, \ldots, x^r)\, dx^1 \cdots dx^r < \infty$.

If

(3.18) $\lim_{i \to \infty} T_i(S) = T_0(S)$

then

(3.19) $\lim_{i \to \infty} \int_{M_r} g(x^1, \ldots, x^r)\, dT_i = \int_{M_r} g(x^1, \ldots, x^r)\, dT_0$.

Proof: Let $M_{r,c}$ be the sphere in $M_r$ with center at the origin and radius c.
Clearly,

(3.20) $\lim_{c \to \infty} \int_{M_{r,c}} g(x^1, \ldots, x^r)\, dx^1 \cdots dx^r = \int_{M_r} g(x^1, \ldots, x^r)\, dx^1 \cdots dx^r$.

Hence, because of (3.16), we have

(3.21) $\lim_{c \to \infty} \left[ \int_{M_{r,c}} g(x^1, \ldots, x^r)\, dT_i - \int_{M_r} g(x^1, \ldots, x^r)\, dT_i \right] = 0$

uniformly in i. Hence our lemma is proved if we show that

(3.22) $\lim_{i \to \infty} \int_{M_{r,c}} g(x^1, \ldots, x^r)\, dT_i = \int_{M_{r,c}} g(x^1, \ldots, x^r)\, dT_0$

for any finite c. Let $g_A(x^1, \ldots, x^r) = g(x^1, \ldots, x^r)$ when $g(x^1, \ldots, x^r) \le A$,
and = 0 otherwise. Since

$\lim_{A \to \infty} \int_{M_{r,c}} (g - g_A)\, dx^1 \cdots dx^r = 0$

it follows from (3.16) that

(3.23) $\lim_{A \to \infty} \int_{M_{r,c}} (g - g_A)\, dT_i = 0$

uniformly in i. Hence, our lemma is proved if we can show that

(3.24) $\lim_{i \to \infty} \int_{M_{r,c}} g_A\, dT_i = \int_{M_{r,c}} g_A\, dT_0$

for any c > 0 and any A > 0. Let $S_j$ denote the set of all points in $M_{r,c}$ for
which

(3.25) $(j - 1)\varepsilon \le g_A < j\varepsilon$

where $\varepsilon$ is a given positive number. We have

(3.26) $\sum_j (j - 1)\varepsilon \int_{S_j} dT_i \le \int_{M_{r,c}} g_A\, dT_i \le \sum_j j\varepsilon \int_{S_j} dT_i$ (i = 0, 1, 2, ...).

Since for any $\varepsilon$, j can take only a finite number of values, and since $\varepsilon$ can be
chosen arbitrarily small, our lemma follows easily from (3.18) and (3.26).
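The sandwich (3.26) is easy to check on a small example. The following sketch replaces the set function $T_i$ by a finite discrete measure given as (point, mass) pairs, which is an assumption made only to keep the example exact; `layer_bounds` is a hypothetical name.

```python
import math
from fractions import Fraction

def layer_bounds(points, g, eps):
    """Return (lower, integral, upper) for a bounded non-negative g.

    points is a list of (x, mass) pairs describing a finite measure T.
    Each point is assigned to the layer S_j with (j - 1) * eps <= g(x) < j * eps,
    and the two sums of (3.26) are accumulated together with the integral itself.
    """
    lower = integral = upper = Fraction(0)
    for x, mass in points:
        value = g(x)
        j = math.floor(value / eps) + 1
        lower += (j - 1) * eps * mass
        integral += value * mass
        upper += j * eps * mass
    return lower, integral, upper

points = [
    (Fraction(0), Fraction(1, 4)),
    (Fraction(1, 2), Fraction(1, 2)),
    (Fraction(1), Fraction(1, 4)),
]
lower, integral, upper = layer_bounds(points, lambda x: x * x, Fraction(1, 10))
```

The gap `upper - lower` equals $\varepsilon$ times the total mass, so both bounds close in on the integral as $\varepsilon$ decreases; since each bound involves only finitely many values $T_i(S_j)$, convergence of these values carries over to the integrals.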

Lemma 3.3. Let $\{\zeta_i\}$ be a sequence of decision rules such that $\lim_{i \to \infty} \zeta_i = \zeta_0$ and
$r(F, \zeta_i)$ is a bounded function of F and i ($i \ge 1$). Then

(3.27) $\liminf_{i \to \infty} r(F, \zeta_i) \ge r(F, \zeta_0)$.

Proof: First we shall show that it is sufficient to prove Lemma 3.3 for any
finite space D. For this purpose, assume that Lemma 3.3 is true for any finite
decision space, but there exists a non-finite compact decision space D and a
sequence $\{\zeta_i\}$ such that $\lim_{i \to \infty} \zeta_i = \zeta_0$ and

(3.28) $\liminf_{i \to \infty} r(F, \zeta_i) = r(F, \zeta_0) - \delta$ for some F ($\delta > 0$).

Since $\zeta_i \to \zeta_0$, there exists a sequence $\{D^*_{k_1 \ldots k_l}\}$ of subsets of D satisfying the
conditions (3.7)-(3.9) and such that

(3.29) $\lim_{i \to \infty} H_{r,i}(S_r, D^*_{k_1 \ldots k_l}) = H_{r,0}(S_r, D^*_{k_1 \ldots k_l})$

where $H_{r,i}(S_r, D^*)$ is the function $H_r$ associated with $\zeta_i$ (i = 0, 1, 2, ...). Let
$\lambda$ be a fixed value of l and consider the corresponding finite sequence $\{D^*_{k_1 \ldots k_\lambda}\}$
of subsets of D. Let k be the number of elements in this finite sequence. We
select one point from each element of the finite sequence $\{D^*_{k_1 \ldots k_\lambda}\}$. Let the
points selected be $d_1, d_2, \ldots, d_k$ and let $\bar{D}$ denote the set consisting of the points
$d_1, \ldots, d_k$. Let $\bar{\zeta}_i$ be the decision rule defined as follows: the function
$\bar{h}_{r,i}(x^1, \ldots, x^r, d_j)$ associated with $\bar{\zeta}_i$ is equal to $h_{r,i}(x^1, \ldots, x^r, D^*)$ where D* is
equal to the element of the finite sequence $\{D^*_{k_1 \ldots k_\lambda}\}$ which contains the point
$d_j$ (j = 1, ..., k). Clearly, because of (3.29),

(3.30) $\lim_{i \to \infty} \bar{\zeta}_i = \bar{\zeta}_0$.

Furthermore, for sufficiently large $\lambda$ we obviously have

(3.31) $| r(F, \zeta_i) - r(F, \bar{\zeta}_i) | \le \varepsilon$ for i = 0, 1, 2, ..., ad inf.

Since for finite D our lemma is assumed to be true, we have

(3.32) $\liminf_{i \to \infty} r(F, \bar{\zeta}_i) \ge r(F, \bar{\zeta}_0)$.

Choosing $\varepsilon < \delta/3$, we obtain a contradiction from (3.28), (3.31) and (3.32). Thus,
it is sufficient to prove Lemma 3.3 for finite D. In the remainder of the proof
we shall assume that D consists of the points $d_1, \ldots, d_k$.

The probability that we shall take exactly m observations when $\zeta_i$ is used and
F is true is given by

(3.33) prob $\{n = m \mid F, \zeta_i\} = \int_{M_m} p_m(x^1, \ldots, x^m \mid F)\, h_{m,i}(x^1, \ldots, x^m, D)\, dx^1 \cdots dx^m$

where $M_m$ denotes the m-dimensional sample space. Since

$\lim_{i \to \infty} H_{m,i}(S_m, D) = H_{m,0}(S_m, D)$,

it follows from Lemma 3.2 that

(3.34) $\lim_{i \to \infty}$ prob $\{n = m \mid F, \zeta_i\}$ = prob $\{n = m \mid F, \zeta_0\}$.

Hence

(3.35) $\lim_{i \to \infty}$ prob $\{n \le m \mid F, \zeta_i\}$ = prob $\{n \le m \mid F, \zeta_0\}$.

Since $r(F, \zeta_i)$ is a bounded function of F and i ($i \ge 1$), we must have

(3.36) $\lim_{m \to \infty}$ prob $\{n \le m \mid F, \zeta_i\} = 1$ (i = 1, 2, ...)

uniformly in F and i. From (3.35) and (3.36) it follows that

(3.37) $\lim_{m \to \infty}$ prob $\{n \le m \mid F, \zeta_0\} = 1$

uniformly in F. Because of (3.36) and (3.37), we have

(3.38) $r(F, \zeta_i) = \sum_{m=1}^{\infty} r_m(F, \zeta_i)$ (i = 0, 1, 2, ..., ad inf.),

where

(3.39) $r_m(F, \zeta_i) = \sum_{j=1}^{k} \int_{M_m} p_m(x^1, \ldots, x^m \mid F)\, W(F, d_j)\, dH_{m,i}(S_m, d_j) + \int_{M_m} p_m(x^1, \ldots, x^m \mid F)\, c(x^1, \ldots, x^m)\, dH_{m,i}(S_m, D)$.

Since

$\lim_{i \to \infty} H_{m,i}(S_m, D^*) = H_{m,0}(S_m, D^*)$

for any subset D* of D, it follows from Lemma 3.2 that

(3.40) $\lim_{i \to \infty} r_m(F, \zeta_i) = r_m(F, \zeta_0)$.

Lemma 3.3 is an immediate consequence of (3.38) and (3.40).

3.4. Equality of Sup Inf r and Inf Sup r, and other theorems. In this section
we shall prove the main theorems for the continuous case, using the lemmas derived
in the preceding section.

Theorem 3.1. If Conditions 3.1-3.5 are fulfilled, then

(3.41) $\sup_{\xi} \inf_{\zeta} r(\xi, \zeta) = \inf_{\zeta} \sup_{\xi} r(\xi, \zeta)$.

Proof: Let $Z^m$ denote the class of all $\zeta$'s for which prob $\{n \le m \mid F\} = 1$
for all F. We shall denote an element of $Z^m$ by $\zeta^m$. First we shall show that it
is sufficient if for any finite m we can prove Theorem 3.1 under the restriction
that $\zeta$ must be an element of $Z^m$. For this purpose, put $W_0 = \sup_{F,d} W(F, d)$
and choose a positive integer $m_\varepsilon$ so that

(3.42) $c(x^1, \ldots, x^m) > \frac{W_0^2}{\varepsilon}$

for all $m \ge m_\varepsilon$. The existence of such a value $m_\varepsilon$ follows from Condition 3.4.
We shall now show that for any $\xi$ we have

(3.43) $\inf_{\zeta^m} r(\xi, \zeta^m) \le \inf_{\zeta} r(\xi, \zeta) + \varepsilon$ for any $m \ge m_\varepsilon$.

Let $\zeta_1$ be any decision rule. There are two cases to be considered: (a) prob
$\{n \ge m_\varepsilon \mid \xi, \zeta_1\} \ge \frac{\varepsilon}{W_0}$; (b) prob $\{n \ge m_\varepsilon \mid \xi, \zeta_1\} < \frac{\varepsilon}{W_0}$. In case (a) we have
$r(\xi, \zeta_1) \ge W_0$. In this case, let $\zeta_2$ be the rule that we decide for some d without
taking any observations. Clearly, we shall have $r(\xi, \zeta_2) \le W_0$ and, therefore,
$r(\xi, \zeta_2) \le r(\xi, \zeta_1)$. In case (b), let $\zeta_2$ be defined as follows: $h_r(x^1, \ldots, x^r, D^*)$
for $\zeta_2$ is the same as that for $\zeta_1$ when $r < m_\varepsilon$, and $h_r(x^1, \ldots, x^r, d_0)$ for $\zeta_2$ is equal
to $1 - \sum_{k=1}^{r-1} h_k(x^1, \ldots, x^k, D)$ when $r = m_\varepsilon$, and zero when $r > m_\varepsilon$, where $d_0$ is a
fixed element of D. Since prob $\{n \ge m_\varepsilon \mid \xi, \zeta_1\} < \frac{\varepsilon}{W_0}$, we have

$r(\xi, \zeta_2) \le r(\xi, \zeta_1) + \varepsilon$.

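In both cases $\zeta_2$ is an element of $Z^{m_\varepsilon}$. Hence (3.43) is proved. From (3.43) we
obtain

(3.44) $\sup_{\xi} \inf_{\zeta} r \le \sup_{\xi} \inf_{\zeta^{m_\varepsilon}} r \le \sup_{\xi} \inf_{\zeta} r + \varepsilon$.

Assume now that

(3.45) $\sup_{\xi} \inf_{\zeta^m} r = \inf_{\zeta^m} \sup_{\xi} r$

holds for any m. From (3.44) and (3.45) we obtain

(3.46) $\inf_{\zeta^{m_\varepsilon}} \sup_{\xi} r \le \sup_{\xi} \inf_{\zeta} r + \varepsilon$.

Hence

(3.47) $\inf_{\zeta} \sup_{\xi} r \le \sup_{\xi} \inf_{\zeta} r + \varepsilon$.

Since this is true for any $\varepsilon$, we have

(3.48) $\inf_{\zeta} \sup_{\xi} r \le \sup_{\xi} \inf_{\zeta} r$.

Theorem 3.1 follows from (3.48) and Lemma 1.3.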
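The content of (3.41) can be illustrated in the simplest finite setting: two distributions $F_1, F_2$, two decisions $d_1, d_2$, and no experimentation. The sketch below is not the argument of the paper; the risk table is hypothetical, and mixed strategies are searched on a grid of exact fractions. Because the risk is linear in each mixture, the inner optimum is always attained at a pure strategy.

```python
from fractions import Fraction

# RISK[i][j] = r(F_i, d_j); hypothetical numbers chosen for illustration.
RISK = [[0, 3], [2, 1]]

def bayes_risk(p, j):
    """Risk of the pure decision d_j under the a priori distribution (p, 1 - p)."""
    return p * RISK[0][j] + (1 - p) * RISK[1][j]

def mixed_risk(i, q):
    """Risk under F_i of the rule choosing d_1 with probability q, d_2 otherwise."""
    return q * RISK[i][0] + (1 - q) * RISK[i][1]

def sup_inf(grid):
    """Sup over a priori distributions of the Inf over decisions."""
    return max(min(bayes_risk(p, j) for j in range(2)) for p in grid)

def inf_sup(grid):
    """Inf over randomized rules of the Sup over distributions."""
    return min(max(mixed_risk(i, q) for i in range(2)) for q in grid)

GRID = [Fraction(k, 100) for k in range(101)]
PURE = [Fraction(0), Fraction(1)]
```

On `GRID` both quantities coincide, while on `PURE` (no randomization on either side) a gap remains between them, which is why the equality is stated for a priori distributions $\xi$ and randomized rules $\zeta$.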





To complete the proof of Theorem 3.1, it remains to be shown that (3.45)
holds for any m. Since D is compact, (3.45) is proved if we can prove it for any
finite D. In the remainder of the proof we shall, therefore, assume that D consists
of h points $d_1, \ldots, d_h$. Let $\omega$ be a subset of $\Omega$ that is conditionally compact
in the sense of the metric†

(3.49) $\delta_0(F_1, F_2) = \sup_{S_m} \left| \int_{S_m} dF_1 - \int_{S_m} dF_2 \right|$

where $S_m$ is a subset of the m-dimensional sample space. We shall show that $\omega$
is conditionally compact also in the sense of the intrinsic metric given by

(3.50) $\delta_1(F_1, F_2) = \sup_{\zeta^m} | r(F_1, \zeta^m) - r(F_2, \zeta^m) |$.

Let

(3.51) $\delta_2(F_1, F_2) = \sup_{d} | W(F_1, d) - W(F_2, d) |$

and

(3.52) $\delta_3(F_1, F_2) = \delta_0(F_1, F_2) + \delta_2(F_1, F_2)$.

It follows from Condition 3.3 and Theorem 3.1 in [3] that $\Omega$, and therefore
also $\omega$, is conditionally compact in the sense of the metric $\delta_2(F_1, F_2)$. Hence $\omega$
is conditionally compact in the sense of the metric $\delta_3(F_1, F_2)$. The conditional
compactness of $\omega$ relative to the metric $\delta_1(F_1, F_2)$ is proved, if we can show that
any sequence $\{F_i\}$ that is a Cauchy sequence relative to the metric $\delta_3$ is a Cauchy
sequence also relative to the metric $\delta_1$. Let $\{F_i\}$ (i = 1, 2, ..., ad inf.) be a
Cauchy sequence relative to $\delta_3$. Then there exists a distribution $F_0$ (not necessarily
an element of $\Omega$) and a function W(d) such that

(3.53) $\lim_{i \to \infty} W(F_i, d) = W(d)$ uniformly in d

and

(3.54) $\lim_{i \to \infty} \int_{S_m} dF_i = \int_{S_m} dF_0$

uniformly in $S_m$. We have

(3.55) $r(F_i, \zeta^m) = \sum_{r=1}^{m} \sum_{j=1}^{h} \int_{M_r} p_r(x^1, \ldots, x^r \mid F_i)\, W(F_i, d_j)\, h_r(x^1, \ldots, x^r, d_j)\, dx^1 \cdots dx^r + \sum_{r=1}^{m} \int_{M_r} p_r(x^1, \ldots, x^r \mid F_i)\, c(x^1, \ldots, x^r)\, h_r(x^1, \ldots, x^r, D)\, dx^1 \cdots dx^r$,

where $M_r$ denotes the r-dimensional sample space. The sequence $\{F_i\}$ is a
Cauchy sequence relative to the metric $\delta_1$ if there exists a function $r(\zeta^m)$ such that

(3.56) $\lim_{i \to \infty} r(F_i, \zeta^m) = r(\zeta^m)$

uniformly in $\zeta^m$. Let $\bar{r}(F_i, \zeta^m)$ be the function we obtain from $r(F_i, \zeta^m)$ by
replacing the factor $W(F_i, d_j)$ by $W(d_j)$ under the first integral on the right
hand side of (3.55). Because of (3.53), we have

(3.57) $\lim_{i \to \infty} [ r(F_i, \zeta^m) - \bar{r}(F_i, \zeta^m) ] = 0$

uniformly in $\zeta^m$. Thus, (3.56) is proved if we can show the existence of a function
$\bar{r}(\zeta^m)$ such that

(3.58) $\lim_{i \to \infty} \bar{r}(F_i, \zeta^m) = \bar{r}(\zeta^m)$

uniformly in $\zeta^m$. Let C be a class of functions $\varphi(x^1, \ldots, x^m)$ such that

$| \varphi(x^1, \ldots, x^m) | < A < \infty$ for all $\varphi$ in C.

It then follows from (3.54) that there exists a functional $g(\varphi)$ such that

(3.59) $\lim_{i \to \infty} \int_{M_m} \varphi\, dF_i = g(\varphi)$

uniformly in $\varphi$. Application of this general result yields (3.58) immediately.
Hence, $\{F_i\}$ is a Cauchy sequence relative to the metric $\delta_1$ and, therefore, $\omega$ is
shown to be conditionally compact relative to the metric $\delta_1$ if it is relative to
the metric $\delta_0$.

† By $\int_{S_m} dF$ in (3.49) we mean $\int_{S_m} p_m(x^1, \ldots, x^m \mid F)\, dx^1 \cdots dx^m$.

It then follows from Theorem 3.2 in [3] that $\sup_{\xi} \inf_{\zeta^m} r = \inf_{\zeta^m} \sup_{\xi} r$ if we replace
$\Omega$ by a subset $\omega$ that is conditionally compact relative to $\delta_0$.‡ Since $\Omega$ is separable
relative to $\delta_0$, there exists a sequence $\{\Omega_i\}$ of subsets of $\Omega$ such that $\Omega_i$ is conditionally
compact relative to $\delta_0$, $\Omega_{i+1} \supset \Omega_i$ and $\sum_i \Omega_i = \Omega^*$ is dense in $\Omega$. Let
$\xi^i$ denote an a priori distribution $\xi$ for which $\xi(\Omega_i) = 1$. Since the left and right
hand members in (3.45) remain unchanged when $\Omega$ is replaced by $\Omega^*$, it follows
from Theorem 1.3 that equation (3.45) is proved if we can show that

(3.60) $\lim_{i \to \infty} \inf_{\zeta^m} \sup_{\xi^i} r = \inf_{\zeta^m} \sup_{\xi} r$.

Let $\{\zeta_i^m\}$ (i = 1, 2, ..., ad inf.) be a sequence of decision rules such that

(3.61) $\lim_{i \to \infty} [ \sup_{\xi^i} r(\xi^i, \zeta_i^m) - \inf_{\zeta^m} \sup_{\xi^i} r ] = 0$.

According to Lemmas 3.1 and 3.3, there exists a subsequence $\{i_j\}$ of $\{i\}$ and
a decision rule $\zeta_0^m$ such that

(3.62) $\liminf_{j \to \infty} r(F, \zeta_{i_j}^m) \ge r(F, \zeta_0^m)$ for all F.

Since $\Omega^*$ is dense in $\Omega$, it follows from (3.61) and (3.62) that

(3.63) $\sup_{F} r(F, \zeta_0^m) \le \lim_{i \to \infty} \inf_{\zeta^m} \sup_{\xi^i} r$

and, therefore, (3.60) holds. Thus, (3.45) is proved and the proof of Theorem
3.1 is completed.

‡ Strictly, we would have to write $\inf_{\eta^m}$ instead of $\inf_{\zeta^m}$ where $\eta^m$ is a probability measure in
the space of all $\zeta^m$. But, since the use of any discrete probability measure $\eta^m$ is equivalent to
the use of a $\zeta^m$, and since the restriction to discrete $\eta^m$ does not change $\sup_{\xi} \inf_{\eta^m} r$ or $\inf_{\eta^m} \sup_{\xi} r$,
we can replace $\inf_{\eta^m}$ by $\inf_{\zeta^m}$.

Theorem 3.2. If Conditions 3.1-3.5 are fulfilled, then there exists a minimax
solution, i.e., a decision rule $\zeta_0$ for which

(3.64) $\sup_{F} r(F, \zeta_0) \le \sup_{F} r(F, \zeta)$ for all $\zeta$.

Proof: Because of Theorem 3.1 there exists a sequence $\{\zeta_i\}$ (i = 1, 2, ..., ad
inf.) of decision rules such that

(3.65) $\lim_{i \to \infty} \sup_{F} r(F, \zeta_i) = \inf_{\zeta} \sup_{F} r(F, \zeta)$.

According to Lemmas 3.1 and 3.3 there exists a subsequence $\{\zeta_{i_j}\}$ of $\{\zeta_i\}$ and a
decision rule $\zeta_0$ such that

(3.66) $\liminf_{j \to \infty} r(F, \zeta_{i_j}) \ge r(F, \zeta_0)$ for all F.

It follows from (3.65) and (3.66) that $\zeta_0$ is a minimax solution and Theorem
3.2 is proved.

Theorem 3.3. If Conditions 3.1-3.5 are fulfilled, then for any $\xi$ there exists a
Bayes solution relative to $\xi$.

This theorem is an immediate consequence of Lemmas 3.1 and 3.3.

Theorem 3.4. If Conditions 3.1-3.5 are fulfilled, then the class of all Bayes
solutions in the wide sense is a complete class.

The proof is omitted, since it is entirely analogous to that of Theorem 2.5.

3.5. Formulation of an additional condition. In this section we shall formulate
an additional condition which will permit the derivation of some stronger
theorems. Let the metric $\delta_0(F_1, F_2)$ be defined by

«o(Fi,Fs) = Z~Sup I /* dPx- I dFi\ 

m-I m an Jsn Jsn 

where $S_m$ may be any subset of the m-dimensional sample space.

Condition 3.6. The space $\Omega$ is compact relative to the metric $\delta_0(F_1, F_2)$, and

$\lim_{i \to \infty} W(F_i, d) = W(F_0, d)$

uniformly in d if $\lim_{i \to \infty} \delta_0(F_i, F_0) = 0$.

Theorem 3.5. If Conditions 3.1-3.6 hold, then

(i) there exists a least favorable a priori distribution;
(ii) any minimax solution is a Bayes solution in the strict sense;
(iii) for any decision rule $\zeta_0$ which is not a Bayes solution in the strict sense and
for which $r(F, \zeta_0)$ is a bounded function of F there exists a decision rule which is a
Bayes solution in the strict sense and is uniformly better than $\zeta_0$.

Proof: The proofs of (i) and (ii) are entirely analogous to those of (i) and (ii)
in Theorem 2.6, and will therefore be omitted here.

To prove (iii), let $\zeta_0$ be a decision rule that is not a Bayes solution in the strict
sense and for which $r(F, \zeta_0)$ is bounded. We replace the weight function W(F, d)
by W*(F, d) = W(F, d) - $r(F, \zeta_0)$. We shall show that (i) remains valid when
W(F, d) is replaced by W*(F, d). This is not obvious, since $r(F, \zeta_0)$, and therefore
also W*(F, d), may not be continuous in F. First we shall prove that

(3.67) $\liminf_{i \to \infty} r(\xi_i, \zeta_0) \ge r(\xi_0, \zeta_0)$

for any sequence $\{\xi_i\}$ for which

$\lim_{i \to \infty} \xi_i(\omega) = \xi_0(\omega)$

for any open subset $\omega$ of $\Omega$ (in the sense of the metric $\delta_0$) whose boundary has
probability measure zero according to $\xi_0$. Let $r_m(F, \zeta)$ denote the conditional
expected value of the loss W(F, d) plus the cost of experimentation when n = m,
F is true and the rule $\zeta$ is used by the statistician (see equation (3.39)). Since
W(F, d) and the cost of experimentation when m observations are taken are
uniformly bounded, one can easily verify that

(3.68) $\lim_{i \to \infty} r_m(F_i, \zeta_0) = r_m(F_0, \zeta_0)$

for any sequence $\{F_i\}$ for which

(3.69) $\lim_{i \to \infty} \delta_0(F_i, F_0) = 0$.

Hence, since $\Omega$ is compact (Condition 3.6),

(3.70) $\lim_{i \to \infty} r_m(\xi_i, \zeta_0) = r_m(\xi_0, \zeta_0)$

where

(3.71) $r_m(\xi, \zeta_0) = \int_{\Omega} r_m(F, \zeta_0)\, d\xi$.

Since

$r(\xi, \zeta_0) = \sum_{m=1}^{\infty} r_m(\xi, \zeta_0)$,

inequality (3.67) follows from (3.70).

The remainder of the proof of (iii) will be omitted here, since it is the same
as that of (iii) in Theorem 2.6.





We shall now replace Condition 3.6 by the following weaker one.

Condition 3.6*. There exists a sequence $\{\Omega_i\}$ (i = 1, 2, ..., ad inf.) of subsets
of $\Omega$ such that Condition 3.6 is fulfilled when $\Omega$ is replaced by $\Omega_i$, $\Omega_{i+1} \supset \Omega_i$ and
$\lim_{i \to \infty} \Omega_i = \Omega$.

Theorem 3.6. If Conditions 3.1-3.5 and 3.6* are fulfilled then

(i) A minimax solution $\zeta_0$ and a sequence $\{\zeta_i\}$ (i = 1, 2, ..., ad inf.) exist
such that $\lim_{i \to \infty} \zeta_i = \zeta_0$ and $\zeta_i$ (i = 1, 2, ..., ad inf.) is a Bayes solution in the strict
sense.

(ii) For any decision rule $\zeta_0$ for which $r(F, \zeta_0)$ is bounded there exists another
decision rule $\zeta_1$ such that $\zeta_1$ is a limit of a sequence of Bayes solutions in the strict
sense and $r(F, \zeta_1) \le r(F, \zeta_0)$ for all F in $\Omega$.

Proof: According to Theorem 3.5, for each i there exists a decision rule
$\zeta_i$ (i = 1, 2, ..., ad inf.) such that $\zeta_i$ is a minimax solution and a Bayes solution
in the strict sense when $\Omega$ is replaced by $\Omega_i$. Let $\{\zeta_{i_j}\}$ be a subsequence of the
sequence $\{\zeta_i\}$ such that $\{\zeta_{i_j}\}$ admits a limit $\zeta_0$, i.e., $\lim_{j \to \infty} \zeta_{i_j} = \zeta_0$. Because of
Lemma 3.3,

(3.72) $\liminf_{j \to \infty} r(F, \zeta_{i_j}) \ge r(F, \zeta_0)$.

Hence $\zeta_0$ is a minimax solution relative to the original space $\Omega$ and statement
(i) is proved.

To prove (ii), replace W(F, d) by W*(F, d) = W(F, d) - $r(F, \zeta_0)$ where $\zeta_0$
is a decision rule for which $r(F, \zeta_0)$ is bounded. In proving statement (iii) of
Theorem 3.5, we have shown that there exists a decision rule $\zeta_{1i}$ (i = 1, 2, ...,
ad inf.) such that $\zeta_{1i}$ is a minimax solution and a Bayes solution in the strict sense
when $\Omega$ is replaced by $\Omega_i$ and W(F, d) by W*(F, d). Clearly, $\zeta_{1i}$ remains a
Bayes solution in the strict sense also relative to $\Omega$ and W(F, d). Since $\zeta_{1i}$ is a
minimax solution relative to $\Omega_i$ and W*(F, d), we have

(3.73) $r(F, \zeta_{1i}) \le r(F, \zeta_0)$ for all F in $\Omega_i$.

Let $\{\zeta_{1 i_j}\}$ be a convergent subsequence of $\{\zeta_{1i}\}$ and let $\lim_{j \to \infty} \zeta_{1 i_j} = \zeta_1$. Then,
because of Lemma 3.3, we have

$r(F, \zeta_1) \le r(F, \zeta_0)$ for all F in $\Omega$.

Since $\zeta_1$ is a limit of a sequence of Bayes solutions in the strict sense, statement
(ii) is proved.

Addition at proof reading. After this paper was sent to the printer the author
found that $\Omega$ is always separable (in the sense of the convergence definition in
Condition 3.5) and, therefore, Condition 3.5 is unnecessary. A proof of the
separability of $\Omega$ will appear in a forthcoming publication of the author.

The boundedness of $r(F, \zeta_i)$ is not necessary for the validity of Lemma 3.3.
Let $\lim_{i \to \infty} \zeta_i = \zeta_0$ and suppose that for some F, say $F_0$, $r(F_0, \zeta_i)$ is not bounded
in i. If $\liminf_{i \to \infty} r(F_0, \zeta_i) = \infty$, Lemma 3.3 obviously holds for $F = F_0$. If
$\liminf_{i \to \infty} r(F_0, \zeta_i) = g < \infty$, let $\{i_j\}$ be a subsequence of $\{i\}$ such that
$\lim_{j \to \infty} r(F_0, \zeta_{i_j}) = g$. Since $r(F_0, \zeta_{i_j})$ is a bounded function of j, Lemma 3.3 is
applicable and we obtain $g \ge r(F_0, \zeta_0)$. In a similar way, one can see that also
Lemma 2.4 remains valid without assuming the boundedness of $r(F, \eta_i)$.

Although not stated explicitly, several functions considered in this paper are
assumed to be measurable with respect to certain additive classes of subsets.
In the continuous case, for example, the precise measurability assumptions may
be stated as follows: Let B be the class of all Borel subsets of the infinite dimensional
sample space M. Let H be the smallest additive class of subsets of
$\Omega$ which contains any subset of $\Omega$ which is open in the sense of at least one of
the convergence definitions considered in this paper. Let T be the smallest
additive class of subsets of D which contains all open subsets of D (in the sense
of the metric $\delta(d_1, d_2, \Omega)$). By the symbolic product H × T we mean the
smallest additive class of subsets of the Cartesian product $\Omega$ × D which contains
the Cartesian product of any member of H by any member of T. The
symbolic product H × B is similarly defined. It is assumed that: (1) W(F, d)
is measurable (H × T); (2) $p_m(x^1, \ldots, x^m \mid F)$ is measurable (B × H); (3)
$\delta_{x^1 \ldots x^r}(D^*)$ is measurable (B) for any member D* of T; (4) $z_r(x^1, \ldots, x^r)$ and
$c_r(x^1, \ldots, x^r)$ are measurable (B). These assumptions are sufficient to insure
the measurability (H) of $r(F, \zeta)$ for any $\zeta$.

REFERENCES 

[1] J. v. Neumann and Oskar Morgenstern, Theory of Games and Economic Behavior,
Princeton University Press, 1944.

[2] A. Wald, "Generalization of a theorem by v. Neumann concerning zero sum two-person
games," Annals of Mathematics, Vol. 46 (April, 1945).

[3] A. Wald, "Foundations of a general theory of sequential decision functions,"
Econometrica, Vol. 15 (October, 1947).



THE MULTIPLICATIVE PROCESS
By Richard Otter¹,²

University of Notre Dame

1. Introduction and summary. The multiplicative process is usually defined
by the sequence of random variables $X_0, X_1, \ldots$ whose distributions are
specified as follows: $P(X_0 = 1) = 1$, $\sum_{r=0}^{\infty} P(X_1 = r) = 1$, and if $X_n = 0$ then
$P(X_{n+1} = 0) = 1$, whereas if $X_n$ is a positive integer then $X_{n+1}$ is distributed
as the sum of $X_n$ independent random variables each with the distribution of $X_1$.
The variable $X_n$ is interpreted as the number of "particles" in the nth generation,
and the index n as a discrete time parameter. This has been the method of
approach in previous studies of the process [1, 2, 3, 4, 5]. The multiplicative
process has various applications, notably in the study of population growth, the
spread of epidemics or rumors, and the nuclear chain reaction. The closely
related "birth and death" process was recently studied by Kendall [6].
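The definition just given translates directly into a simulation. The following is a minimal sketch under stated assumptions: the distribution of $X_1$ is supplied as a finite dictionary `offspring` mapping r to $P(X_1 = r)$, and the names `next_generation` and `simulate` are hypothetical.

```python
import random

def next_generation(x_n, offspring, rng):
    """Sample X_{n+1} given X_n = x_n.

    X_{n+1} is the sum of x_n independent copies of X_1; if x_n == 0 the
    process stays at 0.
    """
    if x_n == 0:
        return 0
    values = list(offspring)
    weights = [offspring[r] for r in values]
    return sum(rng.choices(values, weights=weights, k=x_n))

def simulate(offspring, generations, seed=0):
    """Return the path [X_0, X_1, ..., X_generations], starting from X_0 = 1."""
    rng = random.Random(seed)
    path = [1]
    for _ in range(generations):
        path.append(next_generation(path[-1], offspring, rng))
    return path

# Hypothetical offspring law: 0, 1 or 2 particles with probabilities 1/4, 1/2, 1/4.
example_path = simulate({0: 0.25, 1: 0.5, 2: 0.25}, generations=10, seed=1)
```

With a degenerate offspring law the path is deterministic, for instance exact doubling when every particle has two descendants, which gives a quick sanity check of the recursion.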

Whenever one studies the probability theory of a particular system there
seem to be definite conceptual advantages in defining explicitly the set $\mathcal{T}$ of
elementary events, the additive class $\mathfrak{M}$ of subsets of $\mathcal{T}$, called events, and the
probability measure P for the events of $\mathfrak{M}$. Now an elementary event of this
process can be represented by a rooted tree where the original particle is
represented by the root vertex and where the particles of the nth generation are
represented by the vertices n segments removed from the root. The tree will be
finite or infinite according to whether a finite or an infinite number of particles
are involved in the elementary event. Thus, the set of trees is the natural
choice for $\mathcal{T}$. The first part of this paper is devoted to a more precise description
of $\mathcal{T}$, $\mathfrak{M}$ and P. We shall then see easily that $X_n(t)$, the number of vertices n
segments removed from the root of $t \in \mathcal{T}$, i.e. the number of particles in the nth
generation, has the distribution defined in the preceding paragraph. Since the
time does not appear in our description of $\mathcal{T}$ we fetter ourselves somewhat if we
interpret n as a discrete time parameter. Thus, we have already reaped some
harvest from considering the process from the point of view of $\mathcal{T}$. Another
advantage is that we are led in a natural way to study the distribution of other
structural features of the trees, e.g. the total number of vertices, or the number of
vertices with k outgoing segments.

The chief results of this paper are as follows. The recursion formula for the
probability $P_n$ that a tree have n vertices, n = 1, 2, ..., is obtained as well as an
asymptotic estimate of $P_n$ valid for large n. The distributions of the number of
branches at the root in a finite tree, an infinite tree, or in a tree with n vertices
are obtained and the asymptotic distribution of the latter as $n \to \infty$. The
distribution of the fraction of vertices with k outgoing segments in the finite
trees, in the trees with n vertices, and the asymptotic distribution of the latter
as $n \to \infty$ are also found. Finally, an estimate is obtained for the probability
that a tree be finite in case this probability is near 1, a result which was previously
obtained by Kolmogoroff [7].

¹ Research under an Office of Naval Research contract.

² The author wishes to express his gratitude to Professor E. Artin of Princeton University
for the suggestion of this problem and his encouragement towards its solution.

2. The space of trees. We shall use the notation $\{a\}$, $\{a_1, a_2, \cdots a_n\}$,
$\{a_j\}_{j \in J}$, and $\{a_j \mid R\}_{j \in J}$ to denote the sets which consist of respectively the single
element a, the elements $a_1, a_2, \cdots a_n$, all $a_j$ with $j \in J$, and all $a_j$ with the
property R and $j \in J$. We denote the union of two sets A and B by A + B, their
intersection by AB, and the cartesian product of n identical factors each of
which is A by $A^{(n)}$.

Let I denote the set of positive integers. We assume given for each $n \in I$ a
countable set $U_n$ of objects called vertices, i.e.

$U_n = \{u_{i_1 i_2 \cdots i_n}\}_{(i_1, i_2, \cdots i_n) \in I^{(n)}}$.

Let $u_0$ be a vertex distinct from all the other vertices and let $U = \{u_0\} + \sum_{n \in I} U_n$
be the collection of all the vertices. We shall interpret $u_0$ as the original parent
particle and the vertex $u_{152}$, for example, as the second son of the fifth son of the
first son of the original particle. If s is a subset of U, $s \subset U$, and if $i_1, i_2, \cdots i_{n+m}$
are such that $u_{i_1 \cdots i_n}, u_{i_1 \cdots i_n i_{n+1}}, \cdots u_{i_1 \cdots i_n i_{n+1} \cdots i_{n+m}}$ each belong to s then
this set of vertices is called a path from $u_{i_1 \cdots i_n}$ to $u_{i_1 \cdots i_{n+m}}$ in s and $m \ge 0$ is the
length of the path or the distance from $u_{i_1 \cdots i_n}$ to $u_{i_1 \cdots i_{n+m}}$. If m = 1 we call
the path a segment, for short.

For the sake of convenience let us agree to put $u_{i_1 \cdots i_{n-1} 0} = u_{i_1 \cdots i_{n-1}}$, $(n \ge 1)$,
then we define W(s, u), for $u \in s \subset U$, to be the number of segments from u in s,
and we call W(s, u) the type of the vertex u in s. If t is a subset of U, then we
call t a tree if and only if

(1)    $W(t, u) < \infty$ for $u \in t$

and

(2)    $u_{i_1 \cdots i_n} \in t$ implies $u_{i_1 \cdots i_{n-1} \nu} \in t$ for $\nu = 0, 1, \cdots i_n$.

Let $\mathcal{T}$ be the set of all trees. The condition (2) clearly implies that for each
$t \in \mathcal{T}$ we have $u_0 \in t$ and that there is a unique path from $u_0$ to any other vertex
of t. Hence, whenever a path exists between any two vertices of t it is unique.
We call $u_0$ the root of t. If for $u \in t \in \mathcal{T}$ we have W(t, u) = 0 then u is called an
endpoint of t, and the vertices of t which are not endpoints are called inner
vertices. (It is to be noted that the objects we call trees here are rooted trees
in the sense of Cayley but our trees have their vertices numbered as well.
Usually one would identify the trees $\{u_0, u_1, u_2, u_{11}\}$ and $\{u_0, u_1, u_2, u_{21}\}$,
but we do not wish to do so because for us it is distinctly different whether the
grandson is sired by the first son or by the second son.)
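
A finite tree in this sense can be sketched as a set of index tuples. The encoding below is an assumption made for illustration only: the empty tuple `()` stands for $u_0$, the tuple `(1, 5, 2)` for $u_{152}$, and condition (2) is read as requiring that the parent and all elder brothers of every vertex belong to the set. The names `vertex_type` and `is_tree` are hypothetical.

```python
def vertex_type(tree, u):
    """W(t, u): the number of segments leaving u, i.e. the number of sons of u in the tree."""
    return sum(1 for v in tree if len(v) == len(u) + 1 and v[:len(u)] == u)

def is_tree(vertices):
    """Check condition (2) for a finite set of index tuples; () stands for u_0.

    Condition (1) holds automatically because the set is finite.
    """
    s = set(vertices)
    if () not in s:
        return False
    for v in s:
        if not v:
            continue  # the root has no parent to check
        if any(i < 1 for i in v):
            return False  # indices are positive integers
        if v[:-1] not in s:
            return False  # the parent must be present
        if any(v[:-1] + (j,) not in s for j in range(1, v[-1])):
            return False  # every elder brother must be present
    return True

t1 = {(), (1,), (2,), (1, 1)}   # grandson sired by the first son
t2 = {(), (1,), (2,), (2, 1)}   # grandson sired by the second son
```

Both `t1` and `t2` pass `is_tree`, yet they are different sets, which mirrors the remark that the two family trees are not identified.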

For $u \in t \in \mathcal{T}$ we define the branch of t at u to be the set b(t, u) of all vertices
belonging to any path from u in t. Our convention of admitting paths of length 0 implies
that $u \in b(t, u)$. In fact, if W(t, u) = 0 then b(t, u) = {u}. If t' is a tree such
that $t' \subset t$ then we call t an extension of t', denoted t > t' or t' < t, if W(t', u) > 0
implies W(t', u) = W(t, u). Thus t > t' is equivalent to $t \supset t'$ and

$t = t' + \sum_{u} b(t, u)$

where u runs through all the endpoints of t'. The extension relation imposes a
partial ordering upon $\mathcal{T}$.

The extension t of t' is interpreted as a possible future aspect of a family tree
when its structure at present is given by t', all present members of the family
who have progeny being regarded as sterile.

If $u = u_{i_1 \cdots i_n}$ then the mapping $\varphi$ defined for the vertices of b(t, u) by
putting

$\varphi(u_{i_1 \cdots i_n i_{n+1} \cdots i_{n+m}}) = u_{i_{n+1} \cdots i_{n+m}}$

maps b(t, u) one to one onto a tree $\varphi(b(t, u))$ in such a fashion that if $\{v_1, v_2\}$ is a
segment from $v_1$ to $v_2$ in b(t, u) then $\{\varphi(v_1), \varphi(v_2)\}$ is a segment from $\varphi(v_1)$ to
$\varphi(v_2)$ in $\varphi(b(t, u))$. We call the mapping $\varphi$ a homeomorphism and we say that
b(t, u) is homeomorphic to $\varphi(b(t, u))$.

If a tree contains a finite number of vertices then it is called a finite tree;
otherwise it is an infinite tree. Let $\mathcal{F}$ denote the set of all finite trees and $\mathcal{I}$ the
set of all infinite trees, and let $\mathcal{N}$ denote the set of non-negative integers. For
each $k \in \mathcal{N}$ we define $Y_k(t)$ for $t \in \mathcal{T}$ to be the number of vertices of type k in t.
When it is clear to which tree t we refer we shall usually abbreviate $Y_0(t)$ by m,
and we agree not to use the letter m with any other connotation. For each
$T \in \mathcal{F}$ let $e_1(T), e_2(T), \cdots e_m(T)$ denote its m endpoints. We then define for
$T \in \mathcal{F}$ and $k = (k_1, \cdots k_m) \in \mathcal{N}^{(m)}$

$[T, k] = \{t \mid t > T,\ W(t, e_i(T)) = k_i,\ i = 1, 2, \cdots m;\ t \in \mathcal{T}\}$,

and we call [T, k] a neighborhood. For each $t \in [T, k]$ we say [T, k] is a neighborhood
of t. Then it is easy to show that $\mathcal{T}$ is a topological space where the neighborhoods
defined above form the defining system of neighborhoods [8].

3. The measure theory in $\mathcal{T}$. In the following paragraphs an outline of the
measure theory in $\mathcal{T}$ is given which omits proofs for the most part since they are
easily constructed. The only point of difficulty arises in showing the measure
function to be completely additive, but here the outline has more detail.

Let $\mathfrak{S}$ be the collection of subsets of $\mathcal{T}$ such that $\emptyset \in \mathfrak{S}$ and any other set S
belongs to $\mathfrak{S}$ if and only if there is a $t \in \mathcal{F}$ and a non-void "rectangle set"
$A = A_1 \times A_2 \times \cdots A_m \subset \mathcal{N}^{(m)}$, $m = Y_0(t)$, such that

(3)    $S = \sum_{k \in A} [t, k]$

where the sets $A_1, A_2, \cdots A_m$ may be finite or infinite sets of non-negative
integers. The collection of neighborhoods which appear as terms in (3), i.e.
$\{[t, k]\}_{k \in A}$, we call an $\mathfrak{S}$-partition of S, and t is called the generator of the
$\mathfrak{S}$-partition. Only a finite number of $\mathfrak{S}$-partitions are possible for an $S \in \mathfrak{S}$,
because only a finite number of trees can possibly be generators and there is
only one $\mathfrak{S}$-partition per generator. With respect to our partial ordering of the
trees all possible generators lie between two particular ones. We call the
smaller of these the irreducible generator and the corresponding $\mathfrak{S}$-partition the
irreducible $\mathfrak{S}$-partition of S. Any partition of S into neighborhoods must be a
subpartition of this irreducible $\mathfrak{S}$-partition. The elements of $\mathfrak{S}$ also display
two important properties of the rectangles in Euclidean space, namely if S,
$S' \in \mathfrak{S}$ then

(4)    $SS' \in \mathfrak{S}$

and if $S \subset S'$ then there is a finite chain

(5)    $S = S_0 \subset S_1 \subset \cdots S_n = S'$

such that $S_i - S_{i-1} \in \mathfrak{S}$ for $i = 1, 2, \cdots n$.

A class of sets with the properties (4) and (5) has been called a half-ring by von
Neumann [9].

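Let $p_0, p_1, \cdots$ be given non-negative numbers such that $\sum_{\nu=0}^{\infty} p_\nu = 1$. For
$t \in \mathcal{F}$ let us put

(6)    $p(t) = \prod_{\nu=1}^{\infty} p_\nu^{Y_\nu(t)}$

with the convention $0^0 = 1$. We then define the measure function P for the
sets in $\mathfrak{S}$ by

$P(\emptyset) = 0$,

(7)    $P([t, k]) = p(t)\, p_{k_1} p_{k_2} \cdots p_{k_m}$, where $k = (k_1, k_2, \cdots k_m) \in \mathcal{N}^{(m)}$,

$P(S) = \sum_{k \in A} P([t, k])$, where $\{[t, k]\}_{k \in A}$ is the irreducible $\mathfrak{S}$-partition of S.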

P is evidently non-negative. Letting t be the tree with one vertex and putting
$A = \mathcal{N}$ gives $P(\mathcal{T}) = 1$. It is easy to see that P is completely additive for the
$\mathfrak{S}$-partitions of a neighborhood, but this implies P is completely additive for the
$\mathfrak{S}$-partitions of an arbitrary element S of $\mathfrak{S}$. In order to show that P is
completely additive for any partition of S into elements of $\mathfrak{S}$, it is necessary and
sufficient to show this for an arbitrary partition of a neighborhood into
neighborhoods. One may reach finer and finer partitions of a given neighborhood N by
replacing a neighborhood in any one partition by an $\mathfrak{S}$-partition of the
neighborhood, and repeating the process. The sum of the measures of the sets in the
partition is invariant under such a replacement. On the other hand it can be
shown that all possible partitions of N into neighborhoods may be reached in
this way. More precisely, let $\Pi = \{N_i\}$ be a partition of a neighborhood N
into neighborhoods $N_i$. We call $\Pi$ reduced if whenever a subset of $\Pi$ is an
$\mathfrak{S}$-partition of a neighborhood $M \subset N$ then the partition consists of M itself, i.e.
it is the irreducible $\mathfrak{S}$-partition of M. Then we have the following theorem:

Theorem 1. If $\Pi$ is a reduced partition of a neighborhood N into neighborhoods
then $\Pi = \{N\}$.

The proof is indirect and proceeds by constructing a decreasing sequence of
neighborhoods contained in N whose limit is not void and yet has nothing in
common with any $N_i$, but this is a contradiction.

Let $\mathfrak{F}$ consist of all those sets which may be formed by finite unions of disjoint
elements of a half-ring $\mathfrak{S}$, then $\mathfrak{F}$ is a field of sets. If P is a completely additive
measure on $\mathfrak{S}$ then its natural extension $P_1$ is completely additive on $\mathfrak{F}$ [9].
Kolmogoroff [10] has shown that the completely additive measure $P_1$ may
always be extended to a completely additive measure $P_2$ on the Borel field $\mathfrak{M}$,
i.e. the smallest additive class of sets containing $\mathfrak{F}$. Since $P_2(\mathcal{T}) = 1$, $P_2$ is a
probability measure. For simplicity we put $P_2 = P$. Let us also agree that if
M is the set of all trees with the property R we may write P(R) instead of P(M).
If N is a set with P(N) > 0 then P(M/N) shall denote the conditional probability
of M, given N, i.e. $P(M/N) = (P(N))^{-1} P(MN)$.

4. Independence of the branches. In the multiplicative process the events 
occurring in one branch of a tree are independent of those in a second branch 
disjoint with the first and it is for this reason that the process is relatively simple 
to analyze. In this section we shall try to expose the character of this 
independence. 

For $T \in \mathcal{F}$, let $S_T$ be the set of all extensions of T, then

$S_T = \sum_{k \in \mathcal{N}^{(m)}} [T, k]$,

whence by (6) and (7) $P(S_T) = p(T)$. The following lemma is then easily
established.

Lemma 1. If $P(S_T) > 0$ then $W(t, e_i(T))$, $i = 1, 2, \cdots m$, under the condition
$t \in S_T$, are independent random variables each with the distribution

(8)    $P(W(t, e_i(T)) = k / S_T) = p_k$,    $k = 0, 1, 2, \cdots$.

In the particular case where $T = \{u_0\}$ we have $S_T = \mathcal{T}$ and we put
$W(t) = W(t, u_0)$ for short. Thus W(t) tells what type of vertex the root of t is
and (8) becomes

$P(W = k) = p_k$,    $k = 0, 1, 2, \cdots$.

For $t \in \mathcal{T}$ and $n = 0, 1, 2, \cdots$ let $X_n(t)$ be the number of vertices of
t at distance n from its root. Then $X_0(t) = 1$ and $X_1(t) = W(t)$. If n, r are
positive integers then there is at least one $T \in \mathcal{F}$ which has r of its endpoints, say
$e_{i_1}(T), \cdots e_{i_r}(T)$, at distance n from the root and which also satisfies
$X_{n+1}(T) = 0$. Put

$S_T^{(n,r)} = \{t \mid W(t, e_i(T)) = 0,\ i \ne i_1, \cdots i_r;\ t \in S_T\}$.

Evidently for $t \in S_T^{(n,r)}$

$X_{n+1}(t) = \sum_{\nu=1}^{r} W(t, e_{i_\nu}(T))$,

and a proof similar to that of lemma 1 gives

Lemma 2. If $P(S_T^{(n,r)}) > 0$ then $X_{n+1}(t)$, under the condition $t \in S_T^{(n,r)}$,
is the sum of r independent random variables each with the distribution of $X_1$.

By (6) and (7) for $t \in \mathfrak{M} \subset \mathcal{F} S_T$

$P(t) = \prod_{\nu=0}^{\infty} p_\nu^{Y_\nu(t)}$,

which depends only upon the type of each vertex as it occurs in t. For those
vertices which are inner vertices of T, W(t, u) is constant. Any other vertex
belongs to one and only one of $b(t, e_1(T)), b(t, e_2(T)), \cdots b(t, e_m(T))$ and its
type in t is, of course, the same as its type in the branch to which it belongs.
Furthermore, each branch is homeomorphic to just one tree in $\mathcal{F}$,

$b(t, e_i(T)) \cong t_i$,    $i = 1, 2, \cdots m$.

Since the type of a vertex is preserved under homeomorphism we have

$P(t) = P(S_T) P(t_1) P(t_2) \cdots P(t_m)$.

If, as t runs through $\mathfrak{M}$, $(t_1, t_2, \cdots t_m)$ runs through $\mathfrak{M}_1 \times \mathfrak{M}_2 \times \cdots \mathfrak{M}_m$,
we obtain

(10)    $P(\mathfrak{M}) = P(S_T) P(\mathfrak{M}_1) P(\mathfrak{M}_2) \cdots P(\mathfrak{M}_m)$.

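Let us hereafter put $p = P(\mathcal{F})$. In the particular case of (10) where $\mathfrak{M} = \mathcal{F} S_T$
we clearly have $\mathfrak{M}_i = \mathcal{F}$, $i = 1, 2, \cdots m$, hence

(11)    $P(\mathcal{F} S_T) = P(S_T)\, p^m$.

If we define $T_\nu$, $\nu = 0, 1, 2, \cdots$, to be the tree with $\nu + 1$ vertices which has
$W(T_\nu) = \nu$ then

(12)    $\mathcal{T} = \{u_0\} + \sum_{\nu=1}^{\infty} S_{T_\nu}$,  or  $\mathcal{F} = \{u_0\} + \sum_{\nu=1}^{\infty} \mathcal{F} S_{T_\nu}$,

where

$\{u_0\} S_{T_\nu} = 0$,    $S_{T_i} S_{T_j} = 0$,  $i \ne j$;
$P(S_{T_\nu}) = p_\nu$,    $\nu = 1, 2, \cdots$.

From (11) and (12) we get

(13)    $\sum_{\nu=0}^{\infty} p_\nu p^\nu = p$.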





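For $t \in \mathcal{F} S_T$ let $Z(b(t, e_i(T)))$ be the number of vertices in the branch of t at
$e_i(T)$. In the particular case where $T = \{u_0\}$ we have $b(t, u_0) = t$ and Z(t)
is the number of vertices of t. If now

$\mathcal{F}_n = \{t \mid Z(t) = n,\ t \in \mathcal{F}\}$,    $n = 1, 2, \cdots$;
$P_n = P(\mathcal{F}_n)$,

then by putting $\mathfrak{M} = \mathfrak{M}_T^{(n_1 \cdots n_m)}$ where

(14)    $\mathfrak{M}_T^{(n_1 \cdots n_m)} = \{t \mid Z(b(t, e_i(T))) = n_i,\ i = 1, 2, \cdots m;\ t \in \mathcal{F} S_T\}$,
        $\mathfrak{M}_i = \mathcal{F}_{n_i}$,    $i = 1, 2, \cdots m$,

we may apply (10), which gives

(15)    $P(t \in \mathcal{F} S_T,\ Z(b(t, e_i(T))) = n_i,\ i = 1, 2, \cdots m) = P(S_T) P_{n_1} P_{n_2} \cdots P_{n_m}$.

If p > 0 we may multiply and divide the right hand member of (15) by $p^m$
which leads us to the following lemma:

Lemma 3. If $P(\mathcal{F} S_T) > 0$ then $Z(b(t, e_i(T)))$, $i = 1, 2, \cdots m$, under the
condition $t \in \mathcal{F} S_T$, are m independent random variables each with the distribution
of Z(t), given $t \in \mathcal{F}$.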




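Let

(16)    $f(w) = \sum_{\nu=0}^{\infty} p_\nu w^\nu$,

where w is a complex variable. If one is interested in studying the variables
$X_0, X_1, \cdots$ then one should define another sequence of functions $f_0, f_1, \cdots$
where $f_0(w) = w$ and $f_{n+1}(w) = f(f_n(w))$ for $n = 0, 1, 2, \cdots$. By computing
formally the expansion of $f_n(w)$ around w = 0 it is not difficult to show that
$f_n(w)$ is the generating function for $X_n$, i.e. $f_n(w) = \sum_{\nu=0}^{\infty} P(X_n = \nu) w^\nu$, which is
the starting point of previous investigations of the multiplicative process.
Since we are interested in the distribution of Z we define $\mathcal{P}(z)$
to be the corresponding generating function, i.e.

(17)    $\mathcal{P}(z) = \sum_{n=1}^{\infty} P_n z^n$.

Let $\rho$ and $\alpha$ denote the radii of convergence of (16) and (17) respectively;
since $f(1) = 1$ and $\mathcal{P}(1) = p \le 1$ we have $\rho \ge 1$, $\alpha \ge 1$.

Theorem 2. Let

$\mathcal{G}(z, w) = z f(w) - w$,

then $w = \mathcal{P}(z)$ is the unique analytic solution of

(18)    $\mathcal{G}(z, w) = 0$

in a certain neighborhood of (0, 0).

Proof. Since $\mathcal{P}(z)$ is analytic at 0 and $\mathcal{P}(0) = 0$ it suffices to show that if we
substitute formally $\sum P_n z^n$ for w in $z \sum p_\nu w^\nu$ the coefficient of $z^n$ is uniquely
determined and is $P_n$.

(19)    $z \sum_{\nu=0}^{\infty} p_\nu (\mathcal{P}(z))^\nu = p_0 z + \sum_{n=2}^{\infty} \Big( \sum_{\nu=1}^{n-1} p_\nu \sum_{n_1 + \cdots + n_\nu = n-1} P_{n_1} P_{n_2} \cdots P_{n_\nu} \Big) z^n$.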



If in (14) we put $T = T_\nu$, where $T_\nu$ was defined just before (12), then
$m = Y_0(T_\nu) = \nu$. Let us require in addition that the total number of vertices
in the branches be n - 1, i.e. $n_1 + n_2 + \cdots n_\nu = n - 1$, then

(20)    $\mathcal{F}_n = \sum_{\nu=1}^{n-1} \sum_{\Sigma n_i = n-1} \mathfrak{M}_{T_\nu}^{(n_1 \cdots n_\nu)}$,    $n = 2, 3, \cdots$,

where

$\mathfrak{M}_{T_i}^{(n_1 \cdots n_i)} \mathfrak{M}_{T_j}^{(m_1 \cdots m_j)} = 0$,

unless i = j and $n_1 = m_1, n_2 = m_2, \cdots n_i = m_i$. By applying P to (20) and
using (15) we get the coefficient of $z^n$ in (19) for $n \ge 2$. This together with the
obvious fact that $P_1 = p_0$ completes the proof.

It is worthwhile noticing that by means of the formula of Burman and
Lagrange [11] we can solve the recursion formula for $P_n$ in terms of $p_0, p_1, \cdots$,
namely

(21)    $P_n = \sum_{\Sigma \nu_j = n,\ \Sigma j\nu_j = n-1} \frac{(n-1)!}{\nu_0!\, \nu_1! \cdots} p_0^{\nu_0} p_1^{\nu_1} \cdots$.

Now if t has n vertices we know from Euler's characteristic that
$\sum_j j Y_j(t) = n - 1$. Since $P(t) = \prod_j p_j^{Y_j(t)}$ we see from (21) that

$\frac{(n-1)!}{\nu_0!\, \nu_1! \cdots}$

is the number of trees in $\mathcal{F}_n$ for which $Y_0(t) = \nu_0, Y_1(t) = \nu_1, \cdots$.
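
The recursion contained in (19) and its closed solution can be compared numerically. The following is a sketch only: the function names are hypothetical, the offspring law is assumed to have finite support, and the Burman-Lagrange solution is used in the equivalent form $P_n = n^{-1}[w^{n-1}] f(w)^n$, which is an assumption about how one would evaluate (21) in practice rather than a transcription of it.

```python
from fractions import Fraction

def size_distribution(p, n_max):
    """P_1..P_{n_max} from the recursion P_n = sum_v p_v * [z^(n-1)] (P(z))^v, P_1 = p_0."""
    P = [Fraction(0)] * (n_max + 1)      # P[0] is unused and stays 0
    P[1] = Fraction(p[0])
    for n in range(2, n_max + 1):
        power = [Fraction(1)] + [Fraction(0)] * (n - 1)   # (P(z))^0, truncated at z^(n-1)
        total = Fraction(0)
        for v in range(1, min(n - 1, len(p) - 1) + 1):
            # multiply the current power by the series P(z), keeping terms up to z^(n-1)
            power = [sum(power[j] * P[i - j] for j in range(i)) for i in range(n)]
            total += Fraction(p[v]) * power[n - 1]
        P[n] = total
    return P

def size_probability_lagrange(p, n):
    """P_n = (1/n) * [w^(n-1)] f(w)^n, with f(w) = sum_v p[v] * w^v."""
    coeffs = [Fraction(1)] + [Fraction(0)] * (n - 1)
    for _ in range(n):
        coeffs = [sum(Fraction(p[k]) * coeffs[i - k]
                      for k in range(min(i, len(p) - 1) + 1))
                  for i in range(n)]
    return coeffs[n - 1] / n

law = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]   # hypothetical offspring law
by_recursion = size_distribution(law, 8)
by_lagrange = [size_probability_lagrange(law, n) for n in range(1, 9)]
```

Setting every $p_\nu$ formally equal to 1 turns $P_n$ into the total number of trees with n vertices, which gives a quick sanity check against the Catalan numbers 1, 1, 2, 5, 14.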

Evidently $w = \mathcal{P}(z)$ remains a solution of (18) for all z such that $|z| < \alpha$,
$|w| < \rho$. In case $p_0 = 0$ the constant 0 solves (18). Hence $\mathcal{P}(z) = 0$ for all
z and so $\mathcal{P}(1) = p = 0$. Conversely, if p = 0 then $P_1 = p_0 = 0$ which gives

Corollary 1. p = 0 if and only if $p_0 = 0$.

Since we wish to investigate the distribution in $\mathcal{F}$ we shall henceforth assume
$p_0 \ne 0$.

Any non-constant function g(z) which has a power series development
possessing non-negative coefficients, $g(z) = \sum a_\nu z^\nu$, $a_\nu \ge 0$, with a positive radius of
convergence R has two properties that are important for us:

(22)    g(z) has a singularity at R.

(23)    If $\sum a_\nu R^\nu$ converges then $\sum a_\nu z_0^\nu$ converges absolutely and uniformly
for $|z_0| = R$, and so the series defines a continuous function g(z) there. We
have $\lim_{z \to z_0} g(z) = \sum a_\nu z_0^\nu$ as long as the path of approach to $z_0$ lies in $|z| < R$.
On the other hand, if as z approaches R through real values below R,
$z \to R-$, the limit of g(z) exists then $\sum a_\nu R^\nu$ converges. So if we put
$g(R) = \lim_{z \to R-} g(z) = \sum a_\nu R^\nu$ then the meaning is unique even allowing $\infty$ as a value.
Returning to $\mathcal{P}(z)$, if for $|z| < \alpha$ we have $|w| < \rho$ where $w = \mathcal{P}(z)$ then

(24)    $z = \mathcal{P}^{-1}(w) = \frac{w}{f(w)}$,

which shows the mapping is schlicht in such a domain and that the image domain
cannot contain zeros of f(w). Because of (23) and the fact that $\mathcal{P}(1)$ is finite
even if $\alpha = 1$ we see that the mapping is certainly one to one for $|z| \le 1$.

Corollary 2. p is the smallest root of f(w) = w in $0 < w \le 1$.

Proof. (13) shows p is a root in the interval. If for $0 < w_0 \le p$ we have
$f(w_0) = w_0$ then by (24) $\mathcal{P}^{-1}(w_0) = 1$.

The following corollary is the well known criterion for extinction.

Corollary 3. p = 1 if and only if $f'(1) \le 1$.

Proof. p = 1, $p_0 > 0$, and the convexity of f(w) in $0 \le w \le 1$ guarantee
that $(f(w) - 1)/(w - 1)$ is bounded by 1 and is monotonic increasing with w.
Hence $f'(1)$ exists and is $\le 1$.

Conversely, if $f'(1) \le 1$ then either $f'(w)$ is constant ($= p_1 < 1$) in $0 \le w \le 1$
or else it is strictly increasing with w and in either case $f'(w) < 1$ for $0 \le w < 1$. The
mean value theorem gives f(w) > w in $0 \le w < 1$, hence p = 1.
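
Corollaries 2 and 3 suggest a simple numerical check: the smallest root of f(w) = w can be approached by iterating f from w = 0, since the iterates increase monotonically towards it. The sketch below assumes an offspring law with finite support; the function name `extinction_probability` and the two example laws are hypothetical.

```python
def extinction_probability(p, tol=1e-12, max_iter=100_000):
    """Approximate the smallest root of f(w) = w in [0, 1] by iterating w <- f(w) from 0."""
    f = lambda w: sum(c * w**k for k, c in enumerate(p))
    w = 0.0
    for _ in range(max_iter):
        w_next = f(w)
        if abs(w_next - w) < tol:
            return w_next
        w = w_next
    return w

# f'(1) = 0.25 + 2 * 0.5 = 1.25 > 1, so the root is below 1 (it equals 1/2 here).
p_super = extinction_probability([0.25, 0.25, 0.5])
# f'(1) = 0.25 + 2 * 0.25 = 0.75 <= 1, so the root is 1.
p_sub = extinction_probability([0.5, 0.25, 0.25])
```

Plain iteration converges only linearly, and very slowly when $f'(1)$ is close to 1, so it serves here as an illustration rather than as an efficient method.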

Putting $a = \mathcal{P}(\alpha)$ we have the following lemma:

Lemma 4. $a \le \rho$.

Proof. We already know that $\mathcal{P}(z)$ has a unique analytic inverse given by
(24) for $|\mathcal{P}(z)| < \rho$, but on the other hand $\mathcal{P}'(z) \ne 0$ for $0 < z < \alpha$ so this
inverse is analytic for 0 < w < a. If we had $a > \rho$ we could continue f(w)
analytically by means of (24) along the real axis past its singularity at $\rho$, but
this is impossible.

Corollary. p = 1 if and only if $a \ge 1$.

Proof. The necessity follows from the monotone behavior of $\mathcal{P}(z)$ for
$0 < z < \alpha$. Conversely, if $a \ge 1$ then $z = \mathcal{P}^{-1}(1) = 1/f(1) = 1$.

Theorem 3. If $p_0 + p_1 \ne 1$, then

(25)    a and $\alpha$ are finite;

(26)    $f(a) = a/\alpha$;

(27)    $f'(a) \le 1/\alpha$, where the strict inequality can hold only if $a = \rho$.

Proof. Let $r \ge 2$ be such that $p_r \ne 0$, then for $0 < z < \alpha$, we get from the
functional equation

$z p_r (\mathcal{P}(z))^r - \mathcal{P}(z) < 0$;

$0 < \mathcal{P}(z) < \left(\frac{1}{z p_r}\right)^{1/(r-1)}$.

By letting $z \to \alpha-$ we see a is finite and $\mathcal{P}(z)$ is bounded. Since $\mathcal{P}(z)$ is
monotonic in this region we get $\alpha < \infty$. By letting $z \to \alpha$ in $\mathcal{G}(z, \mathcal{P}(z))$ we get (26).
For $0 < z < \alpha$, $\mathcal{G}_w(z, \mathcal{P}(z)) = z f'(\mathcal{P}(z)) - 1$ is continuous and monotonic
increasing with z and is < 0 for z near 0. From the general theorem on implicit
functions we know $\mathcal{G}_w(z, \mathcal{P}(z)) \ne 0$ for $|z| < \alpha$, so if we let $z \to \alpha$ we obtain (27).

If $a = \rho$ (27) merely guarantees the finiteness of $f'(\rho)$ and gives an upper bound.
One can easily construct an example where $1/\alpha$ is the least upper bound and
one where it is not.

But if $a < \rho$ then since $\mathcal{G}(z, w)$ is analytic at $(\alpha, a)$ and $\mathcal{G}(\alpha, a) = 0$ we obtain
from the implicit function theorem the strict equality in (27).

Corollary. If $\alpha = 1$ then a = p = 1.

Proof. By (26)

(28)    $f(a) = a$,    $a = \mathcal{P}(1) = p \le 1$.

If $a < \rho$ then $f'(p) = 1$ so p = 1 from the convexity of f(w). If $a = \rho$ then
$a \ge 1$ which when combined with (28) gives a = 1.

The case where $p_0 + p_1 = 1$ escapes Theorem 3 but it is easily examined
separately, namely

$f(w) = p_0 + p_1 w$,    $p_0 \ne 0$,

$\mathcal{P}(z) = \sum_{n=1}^{\infty} p_0 p_1^{n-1} z^n = \frac{p_0 z}{1 - p_1 z}$.

Hence p = 1, $\alpha = 1/p_1$ and $a = \rho = \infty$.

For the practical applications of the theory it is valuable to know some
conditions which guarantee $a < \rho$, and thus strict equality in (27). From the
foregoing analysis it is evident that one such condition is $\rho = \infty$, i.e. f(w) is an
entire function, and another is $f'(1) > 1$. If one has enough information about
f(w) to plot its graph for real positive w then the line through the origin tangent
to f(w) in the first quadrant touches the curve at the point $(a, a/\alpha)$ from which
we determine both a and $\alpha$.
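
The tangent construction just described can also be carried out numerically. The sketch below assumes a polynomial f(w) of degree at least 2 (so that $\rho = \infty$) and solves $a f'(a) = f(a)$ by bisection, which works because $w f'(w) - f(w)$ is increasing for w > 0 and equals $-p_0 < 0$ at w = 0. The function name `tangent_point` is hypothetical.

```python
def tangent_point(p, tol=1e-13):
    """Return (a, alpha) with a*f'(a) = f(a) and alpha = a/f(a), for polynomial f."""
    if len(p) < 3 or not any(c > 0 for c in p[2:]):
        raise ValueError("this sketch assumes some p_v > 0 with v >= 2")
    f = lambda w: sum(c * w**k for k, c in enumerate(p))
    df = lambda w: sum(k * c * w**(k - 1) for k, c in enumerate(p) if k > 0)
    h = lambda w: w * df(w) - f(w)       # increasing in w, h(0) = -p_0 < 0
    lo, hi = 0.0, 1.0
    while h(hi) < 0:                     # enlarge the bracket until h changes sign
        hi *= 2.0
    while hi - lo > tol:                 # plain bisection
        mid = 0.5 * (lo + hi)
        if h(mid) < 0:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    return a, a / f(a)

# Hypothetical quadratic law p_0 = 0.2, p_1 = 0.3, p_2 = 0.5.
a, alpha = tangent_point([0.2, 0.3, 0.5])
```

For a quadratic law the result can be checked against the closed forms $a = \sqrt{p_0/p_2}$ and $\alpha^{-1} = p_1 + 2\sqrt{p_0 p_2}$.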

6. Asymptotic properties of the distributions. If we examine the terms of
the sequence $p_0, p_1, \cdots$ we may find that the indices of the non-zero terms are
all multiples of some common integer larger than 1. In this case we should
expect to have $P_n = 0$ with the same sort of regularity. So let us define q to
be the largest integer such that $p_\nu \ne 0$ implies $\nu$ is a multiple of q. Clearly we
have $q \ge 1$ and q = 1 means there is no integer other than 1 which divides the
indices of all the non-zero $p_\nu$. Of course, $p_1 \ne 0$ implies q = 1. The following
theorem establishes an asymptotic estimate for $P_n$ valid for large n, provided
n - 1 is a multiple of q, and incidentally shows that $P_n = 0$, if n - 1 is not a
multiple of q.

Theorem 4. If $a < \rho$ then

(29)    $P_n = q \left( \frac{a}{2\pi \alpha f''(a)} \right)^{1/2} \alpha^{-n} n^{-3/2} + O(\alpha^{-n} n^{-5/2})$,    $n \equiv 1 \pmod q$,

i.e. for large $n \equiv 1 \pmod q$

$P_n \approx q \left( \frac{a}{2\pi \alpha f''(a)} \right)^{1/2} \alpha^{-n} n^{-3/2}$.




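Proof. Let us put $\theta = 2\pi/q$, then for $|w| < \rho$,

$|f(w)| = \Big| \sum_{\nu=0}^{\infty} p_{\nu q} w^{\nu q} \Big| \le \sum_{\nu=0}^{\infty} p_{\nu q} |w|^{\nu q} = f(|w|)$,

and the equality evidently holds if and only if arg w is an integral multiple of $\theta$.
Furthermore, if w is such that $|f(w)| = f(|w|)$ and we put $z = \mathcal{P}^{-1}(w)$ then
$w = \mathcal{P}(w/f(w))$ so we get

$\mathcal{P}(e^{i\theta} z) = e^{i\theta} \mathcal{P}(z)$,

hence $P_n = 0$, if $n \not\equiv 1 \pmod q$.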

For $|z| = \alpha$ and $w = \mathcal{P}(z)$ the point (z, w) satisfies (18) by (23). If we put

$z_\nu = \alpha e^{i\nu\theta}$,
$w_\nu = a e^{i\nu\theta}$,    $\nu = 0, 1, \cdots q - 1$,

then $w_\nu = \mathcal{P}(z_\nu)$ and

$\mathcal{G}_w(z_\nu, w_\nu) = z_\nu f'(w_\nu) - 1 = \alpha f'(a) - 1 = 0$,

so that $z_0, z_1, \cdots z_{q-1}$ are certainly singularities of $\mathcal{P}(z)$. But f(w) is analytic
at $w_\nu$ and $f(w_\nu) = a/\alpha \ne 0$, so the solution of (18) for z,

$z = \mathcal{P}^{-1}(w) = \frac{w}{f(w)}$,

is analytic at $w_\nu$. Furthermore

$\frac{d}{dw} \mathcal{P}^{-1}(w_\nu) = \frac{f(w_\nu) - w_\nu f'(w_\nu)}{f(w_\nu)^2} = 0$,

$\frac{d^2}{dw^2} \mathcal{P}^{-1}(w_\nu) = -\frac{w_\nu f''(w_\nu)}{f(w_\nu)^2} \ne 0$,

which shows that $\mathcal{P}(z)$ has a branch point of order 1 at each $z_\nu$, i.e. $\mathcal{P}(z)$ is an
analytic function of $(z - z_\nu)^{1/2}$ in the neighborhood of $(z_\nu, w_\nu)$, $\nu = 0, 1, \cdots q - 1$.
For $|z| = \alpha$, $w = \mathcal{P}(z)$ but $z \ne z_\nu$ we obtain

$|\mathcal{G}_w(z, w)| \ge 1 - \alpha |f'(w)| \ge 1 - \alpha f'(|w|) > 1 - \alpha f'(a) = 0$,

hence $\mathcal{P}(z)$ is an analytic function of z in a certain neighborhood of such a
pair (z, w).

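By analytic continuation we find a circle of radius $\beta > \alpha$ such that $\mathcal{P}(z)$ is an
analytic function of $(z - z_\nu)^{1/2}$ for $|z| < \beta$. If we make radial cuts in this
circle running outward from each $z_\nu$ then in the resulting domain D each of the
functions $(z - z_\nu)^{1/2}$ is an analytic function of z hence so is $\mathcal{P}(z)$.

Let $\Gamma$ be the path consisting of the boundary of D oriented in the
positive sense, let $\gamma$ be the part of $\Gamma$ lying in the sector $-\pi/q < \arg z < \pi/q$,
and let $\gamma'$ be that part of $\gamma$ leading from $\beta$ to $\alpha$ along the lower lip of the cut at
$\alpha$, thence along the upper lip back to $\beta$. Since $\mathcal{P}(z)$ satisfies the relation
$\mathcal{P}(e^{i\nu\theta} z) = e^{i\nu\theta} \mathcal{P}(z)$ for $\nu = 0, 1, \cdots q - 1$, we see from Cauchy's formula that

$P_n = \frac{1}{2\pi i} \int_\Gamma \frac{\mathcal{P}(z)}{z^{n+1}}\, dz = \frac{A}{2\pi i} \int_\gamma \frac{\mathcal{P}(z)}{z^{n+1}}\, dz$,

where

$A = \sum_{\nu=0}^{q-1} e^{i\nu\theta(1-n)} = 0$,    $n \not\equiv 1 \pmod q$;
$\phantom{A = \sum_{\nu=0}^{q-1} e^{i\nu\theta(1-n)}} = q$,    $n \equiv 1 \pmod q$.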

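Restricting ourselves to $n \equiv 1 \pmod q$ we put

$\mathcal{P}(z) = a + b(z - \alpha)^{1/2} + c(z - \alpha) + (z - \alpha)^{3/2} \mathcal{Q}(z)$,

where $\mathcal{Q}(z)$ is analytic in D. Then $P_n = B + C$, where

$B = \frac{q}{2\pi i} \int_\gamma \frac{a + b(z - \alpha)^{1/2} + c(z - \alpha)}{z^{n+1}}\, dz$,    $C = \frac{q}{2\pi i} \int_\gamma \frac{(z - \alpha)^{3/2} \mathcal{Q}(z)}{z^{n+1}}\, dz$.

We find

$B = \frac{bq}{2\pi i} \int_{\gamma'} \frac{(z - \alpha)^{1/2}}{z^{n+1}}\, dz + O(\beta^{-n}) = bq(-\alpha)^{1/2} (-1)^n \binom{1/2}{n} \alpha^{-n} + O(\beta^{-n})$;

$|C| = O\Big( \int_{\gamma'} \frac{|z - \alpha|^{3/2}}{|z|^{n+1}}\, |dz| \Big) = O\Big( \alpha^{-n} \Big| \binom{3/2}{n} \Big| \Big)$.

The constant b is determined from the equations

$w - a = b(z - \alpha)^{1/2} + \cdots$,

$z - \alpha = -\frac{\alpha^2 f''(a)}{2a} (w - a)^2 + \cdots$.

Using the fact that

$\Big| \binom{1/2}{n} \Big| = \frac{1}{2\sqrt{\pi}}\, n^{-3/2} + O(n^{-5/2})$,    $\Big| \binom{3/2}{n} \Big| = O(n^{-5/2})$,

we finally obtain (29) as desired.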





Thus $P_n$ approaches zero a little faster than exponentially with n regardless
of whether p = 1 or p < 1, except for the special case when $\alpha = 1$. In this
case it is interesting that, according to the corollary to lemma 4, p = 1.

The case where $q \ne 1$ is of no practical importance since one can always bring
q back to 1 by making a very small decrease in one of the non-zero $p_\nu$ and
increasing $p_1$ by the same amount. This can clearly be done so that none of the
important characteristics of f(w) is changed appreciably.
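
The quality of the asymptotic estimate can be examined numerically by comparing it with exact values of $P_n$. The sketch below rests on two assumptions: the leading term is taken in the form $q\,(a / (2\pi\alpha f''(a)))^{1/2} \alpha^{-n} n^{-3/2}$, and the exact $P_n$ is computed as $n^{-1}[w^{n-1}] f(w)^n$ for a polynomial f. The function names and the example law are hypothetical.

```python
from fractions import Fraction
from math import pi, sqrt

def exact_P(p, n):
    """Exact P_n = (1/n) * [w^(n-1)] f(w)^n for a polynomial f with coefficients p."""
    coeffs = [Fraction(1)] + [Fraction(0)] * (n - 1)
    for _ in range(n):
        coeffs = [sum(p[k] * coeffs[i - k] for k in range(min(i, len(p) - 1) + 1))
                  for i in range(n)]
    return coeffs[n - 1] / n

def asymptotic_P(n, a, alpha, f2_at_a, q=1):
    """Assumed leading term: q * sqrt(a / (2*pi*alpha*f''(a))) * alpha**(-n) * n**(-3/2)."""
    return q * sqrt(a / (2 * pi * alpha * f2_at_a)) * alpha ** (-n) * n ** (-1.5)

# Hypothetical law p_0 = 1/4, p_1 = 1/2, p_2 = 1/4: here a = 1, alpha = 1, f''(a) = 1/2.
law = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
n = 200
ratio = float(exact_P(law, n)) / asymptotic_P(n, a=1.0, alpha=1.0, f2_at_a=0.5)
```

Exact rational arithmetic is used for $P_n$ so that the comparison is not blurred by rounding; for this law the ratio differs from 1 by a term of order 1/n.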


7. The limiting distributions of W(t) and $n^{-1} Y_k(t)$ for $t \in \mathcal{F}_n$. Let us
momentarily drop the condition $p_0 \ne 0$. The characteristic function of W is

(30)    $\int_{\mathcal{T}} e^{i\theta W}\, dP = f(e^{i\theta})$,

so that for the rth moment of W we have

(31)    $E(W^r) = i^{-r} \left[ \frac{d^r}{d\theta^r} f(e^{i\theta}) \right]_{\theta = 0}$,    $r = 0, 1, 2, \cdots$.

For the first and second moments we obtain

$E(W) = f'(1)$,

$E(W^2) = f'(1) + f''(1)$,

which shows that the criterion for extinction (Corollary 3 to Theorem 2) may be
stated as follows: the multiplicative process is almost certain to expire if and
only if $E(W) \le 1$. From (30) we see that all the moments of W will be finite
as soon as $\rho > 1$; but if $\rho = 1$ no general statement can be made, except in case
$\alpha = 1$ also, for indeed $\alpha = 1$ implies a = 1 so by (31) and (27) $E(W) = f'(1) \le 1$.

We now reassume $p_0 \ne 0$. Since the variables $Z, Y_0, Y_1, \cdots$ are restricted to
$t \in \mathcal{F}$ it is convenient to see what happens to W in $\mathcal{F}$. If we define $g(w) = p^{-1} f(pw)$
then (13) shows g(w) and $g(e^{i\theta})$ are the generating function and characteristic
function respectively for W, given $t \in \mathcal{F}$. Thus we see immediately that the
first moment of W, given $\mathcal{F}$, is always $\le 1$, and all its moments are finite if p < 1.
In case 0 < p < 1 we may also introduce h(w) defined by

(32)    $f(w) = p\, g(w) + (1 - p)\, h(w)$,

then h(w) is obviously the generating function of W, given $\mathcal{I}$. Here the rth
moment is finite whenever the rth moment of W is finite. (32) gives

$P(W = \nu / \mathcal{I}) = \frac{1 - p^\nu}{1 - p}\, p_\nu$.

It would be interesting to be able to compare this with the corresponding thing
for large finite trees and in this connection we have the following theorem:

Theorem 5. If $a < \rho$ and q = 1,

$\lim_{n \to \infty} P(W = k / \mathcal{F}_n) = \alpha k p_k a^{k-1}$,    $k = 1, 2, \cdots$.





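Proof. By expanding $z f(e^{i\theta} \mathcal{P}(z))$ in powers of z we obtain

(33)    $z f(e^{i\theta} \mathcal{P}(z)) = \sum_{n=1}^{\infty} \varphi_n(\theta) z^n$,

where

$\varphi_n(\theta) = \int_{\mathcal{F}_n} e^{i\theta W}\, dP = \sum_{\nu} e^{i\theta\nu} p_\nu \sum_{\Sigma n_i = n-1} P_{n_1} P_{n_2} \cdots P_{n_\nu}$,

so that if $P_n \ne 0$ then $P_n^{-1} \varphi_n(\theta)$ is the characteristic function of W, given $\mathcal{F}_n$.
From (33) we get

$\varphi_n(\theta) = \frac{1}{2\pi i} \int_\Gamma \frac{f(e^{i\theta} \mathcal{P}(z))}{z^n}\, dz$.

Since $a < \rho$ we may expand $f(e^{i\theta} \mathcal{P}(z))$ about the point $\mathcal{P}(z) = a$ and integrate
as in the proof of theorem 4, thus

$P_n^{-1} \varphi_n(\theta) = \alpha e^{i\theta} f'(a e^{i\theta}) + \epsilon_n(\theta)$.

Since $\epsilon_n(\theta) \to 0$ as $n \to \infty$,

$\lim_{n \to \infty} P_n^{-1} \varphi_n(\theta) = \alpha e^{i\theta} f'(a e^{i\theta})$,

the limit function obviously being the characteristic function for the distribution
whose generating function is $\alpha w f'(aw)$, from which the theorem follows directly.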
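
The limit law of Theorem 5 is easy to tabulate once a and $\alpha$ are known. The sketch below assumes a finite offspring law and takes a and $\alpha$ as given inputs; the function name `limiting_root_law` is hypothetical, and the example uses the closed forms for a quadratic f(w).

```python
from math import sqrt

def limiting_root_law(p, a, alpha):
    """Return [alpha * k * p_k * a**(k-1) for k = 0, 1, ...], the coefficients of alpha*w*f'(a*w)."""
    return [alpha * k * pk * a ** (k - 1) if k > 0 else 0.0 for k, pk in enumerate(p)]

# Hypothetical quadratic law: a = sqrt(p_0/p_2), 1/alpha = p_1 + 2*sqrt(p_0*p_2).
p0, p1, p2 = 0.2, 0.3, 0.5
a = sqrt(p0 / p2)
alpha = 1.0 / (p1 + 2.0 * sqrt(p0 * p2))
law = limiting_root_law([p0, p1, p2], a, alpha)
```

The entries sum to $\alpha f'(a)$, which equals 1 when equality holds in (27); this gives a direct check that the tabulated numbers form a probability distribution.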
Now $\mathcal{P}(z)/p$ is the generating function for Z, given $\mathcal{F}$, and the function solves

(34)    $z g(w) - w = 0$

for $|z| < \alpha$. We find for the rth moment of Z

$E(Z^r / \mathcal{F}) = \frac{1}{p\, i^r} \left[ \frac{d^r}{d\theta^r} \mathcal{P}(e^{i\theta}) \right]_{\theta = 0}$,    $r = 0, 1, \cdots$,

hence all the moments are finite as soon as $\alpha > 1$. Since by (34)

$\frac{dw}{dz} = \frac{g(w)}{1 - z g'(w)} = \frac{w}{z(1 - z g'(w))}$,

we obtain for the first moment

$E(Z / \mathcal{F}) = \frac{1}{1 - g'(1)} = \frac{1}{1 - f'(p)}$.

In a similar way one can express any moment of Z, provided it is finite, in terms
of $f'(p)$, $f''(p)$, etc. If $\alpha = 1$ we see from the corollary to theorem 3 that even
the first moment of Z is infinite, except for the special case where $\rho = 1$ and
$f'(1) < 1$.

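The characteristic function of $Y_k$, given $\mathcal{F}$, is

$\psi_k(\theta) = \frac{1}{p} \sum_{n=1}^{\infty} \psi_{kn}(\theta)$,

where by (21)

$\psi_{kn}(\theta) = \int_{\mathcal{F}_n} e^{i\theta Y_k}\, dP = \sum_{\Sigma \nu_j = n,\ \Sigma j\nu_j = n-1} \frac{(n-1)!}{\nu_0!\, \nu_1! \cdots} p_0^{\nu_0} p_1^{\nu_1} \cdots (p_k e^{i\theta})^{\nu_k} \cdots$.

Thus, if $P_n \ne 0$, $P_n^{-1} \psi_{kn}(\theta)$ is the characteristic function of $Y_k$, given $\mathcal{F}_n$. If
$p_k = 0$ then $\psi_k(\theta) = 1$. If $p_k \ne 0$ put $p_k = e^{q_k}$ then

(35)    $\frac{\partial^r}{\partial \theta^r} \psi_k(\theta) = \frac{i^r}{p} \sum_{n=1}^{\infty} \frac{\partial^r}{\partial q_k^r} \psi_{kn}(\theta)$,

hence

$E(Y_k^r / \mathcal{F}) = \frac{1}{p} \left[ \frac{\partial^r}{\partial q_k^r} \mathcal{P}(z) \right]_{z = 1}$,

which shows that all moments of $Y_k$, given $\mathcal{F}$, are finite if $\alpha > 1$. Let us put $w = \mathcal{P}(z)$
for short, then, by (18),

(36)    $\frac{\partial w}{\partial q_k} = \frac{z p_k w^k}{1 - z f'(w)} = z^2 p_k w^{k-1} \mathcal{P}'(z)$,

which gives for the first moment of $Y_k$, given $\mathcal{F}$,

$E(Y_k / \mathcal{F}) = \frac{p_k p^{k-1}}{1 - f'(p)} = p_k p^{k-1} E(Z / \mathcal{F})$,

which is to be expected since $p_k p^{k-1}$ plays the same role in $\mathcal{F}$ that $p_k$ plays in $\mathcal{T}$.
We may also expect that for $t \in \mathcal{F}_n$, $n^{-1} Y_k$ should be closely related to $p_k$. This
question is settled by the following theorem: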

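Theorem 6. If $a < \rho$ and q = 1 then for x real

$\lim_{n \to \infty} P(n^{-1} Y_k < x / \mathcal{F}_n) = 1$, if $x > \alpha p_k a^{k-1}$;
$\phantom{\lim_{n \to \infty} P(n^{-1} Y_k < x / \mathcal{F}_n)} = 0$, if $x < \alpha p_k a^{k-1}$.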

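Proof. We intend to estimate the rth moment of $n^{-1} Y_k$ for $t \in \mathcal{F}_n$ and n
very large from (35) by means of the contour integral

(37)    $E(n^{-r} Y_k^r / \mathcal{F}_n) = \frac{1}{2\pi i\, n^r P_n} \int_\Gamma \frac{1}{z^{n+1}} \frac{\partial^r w}{\partial q_k^r}\, dz$.

So let us put

$w_r = \frac{\partial^r w}{\partial q_k^r}$,    $w = \mathcal{P}(z)$,    $r = 0, 1, \cdots$,

and denote differentiation with respect to z by an upper index, $w_r^{(1)} = \partial w_r / \partial z$,
then by (36) $w_1 = z^2 p_k w^{k-1} w_0^{(1)}$ and by Leibnitz formula, provided $k \ne 0$,

(38)    $w_r = z^2 p_k \sum_{\nu_0 + \nu_1 + \cdots + \nu_k = r-1} \frac{(r-1)!}{\nu_0!\, \nu_1! \cdots \nu_k!}\, w_{\nu_1} w_{\nu_2} \cdots w_{\nu_{k-1}} w_{\nu_k}^{(1)}$.

The principal contribution to the integral in (37) will come from the term of
(38) which has the largest size for z near $\alpha$. If we put $\zeta = (z - \alpha)^{1/2}$ then w is
regular at $\zeta = 0$ and so is the constant $p_k$. Let us assume that $w_\nu$ has a pole of
order $2\nu - 1$ at $\zeta = 0$ for $\nu = 1, 2, \cdots r - 1$, which is clearly true for $\nu = 1$.
Then if s is the number of $\nu_1, \nu_2, \cdots \nu_{k-1}$ which are = 0, the order of the pole
of the general term of (38) at $\zeta = 0$ is

$\sum_{i=1}^{k-1} (2\nu_i - 1) + s + 2\nu_k + 1 = 2(r - \nu_0) - (k - s)$,

which has the maximum value 2r - 1 if and only if $\nu_0 = \nu_1 = \cdots \nu_{k-1} = 0$,
$\nu_k = r - 1$. Hence

(39)    $w_r = z^2 p_k w^{k-1} w_{r-1}^{(1)} + \zeta^{-2r+2} \mathcal{Q}_r(\zeta)$,

where $\mathcal{Q}_r(\zeta)$ is a regular function of $\zeta$ at zero. For k = 0 the formula (38) is not
correct but it is easy to see directly that (39) is correct for k = 0. If we derive
(39) with respect to z and put r - 1 for r we obtain

$w_{r-1}^{(1)} = z^2 p_k w^{k-1} w_{r-2}^{(2)} + \zeta^{-2r+2} \mathcal{Q}_r^*(\zeta)$,

hence

$w_r = (z^2 p_k w^{k-1})^r \frac{\partial^r w}{\partial z^r} + \zeta^{-2r+2} \mathcal{Q}_r^{**}(\zeta)$.

Substituting in (37) and estimating in a manner similar to that employed
previously we obtain

$E(n^{-r} Y_k^r / \mathcal{F}_n) = \frac{(p_k a^{k-1})^r}{2\pi i\, n^r P_n} \int_\Gamma \frac{\mathcal{P}^{(r)}(z)}{z^{n-2r+1}}\, dz + \int_\Gamma \frac{(z - \alpha)^{-r+1} \mathcal{Q}((z - \alpha)^{1/2})}{2\pi i\, n^r P_n z^{n+1}}\, dz$

$= (p_k a^{k-1})^r\, \frac{P_{n-r}}{P_n}\, \frac{(n - r)(n - r - 1) \cdots (n - 2r + 1)}{n^r} + O(n^{-1/2})$,

and finally

$\lim_{n \to \infty} E(n^{-r} Y_k^r / \mathcal{F}_n) = (\alpha p_k a^{k-1})^r$.

The limit of the rth moment is itself the rth moment of the distribution on the
real line which has all its mass at the point $\alpha p_k a^{k-1}$. Since this distribution is
uniquely determined by its moments, a well known theorem [7] enables us to
conclude that our sequence of distributions has this distribution as limit and
this is equivalent to what is claimed by the theorem.

It is important to notice that if we put the mass $\alpha p_k a^{k-1}$ at the point k this
determines a distribution on the real line because of (26).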


8. The estimation of p. If we wish to estimate p when we know p 0, we 
may obtain an estimate from the knowledge of f{w) in 0 < ui < 1, using the 
method of iteration. That is we choose a function G{wi) such that G(p) = p 
and I G(w) - pl<|w — p|for0<w<l. Then if for any wo m the open 



222 


BICHABD OITEB 


interval ve compute successively where » (7(ie„) for n > 0 

we aie sure tliat Wn converges exponentiallj’- to p as n —♦ oo, 

Obviously f{w) itself has the properties of (?(u)) but we achieve faster con¬ 
vergence to\varcls p using Newton’s method, that is if we put 


WO) 


fl(w) = f(w) - w, 


0(w) ~ w — 


Mw) 

fi(w) ■ 


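If for some reason we expect $p$ to be close to 1 then it is better to put

$$f_2(w) = \frac{f(w) - w}{w - 1}$$

and use $f_2(w)$ in (40) instead of $f_1(w)$, for then we may choose $w_0 = 1$.
Let us put $f'(1) = 1 + \epsilon$, $\epsilon > 0$; then

$$f_2(1 + h) = \frac{f(1 + h) - 1 - h}{h}, \qquad h \neq 0;$$

$$f_2(1) = \lim_{h \to 0} f_2(1 + h) = f'(1) - 1 = \epsilon;$$

$$f_2'(1) = \lim_{h \to 0} \frac{f_2(1 + 2h) - f_2(1 + h)}{h}
         = \lim_{h \to 0} \frac{f(1 + 2h) - 2f(1 + h) + 1}{2h^2}
         = \frac{f''(1)}{2}.$$

Hence

(41) $p \approx w_1 = 1 - \dfrac{2\epsilon}{f''(1)}.$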


This result was previously established by Kolmogoroff [7]. 
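As a rough numerical check of (41), one may compare it with the root of $f(w) = w$ found by Newton's method. The sketch below assumes, purely for illustration, the Poisson generating function $f(w) = e^{\lambda(w-1)}$ with $\lambda$ slightly above 1, so that $\epsilon = \lambda - 1$ and $f''(1) = \lambda^2$. The function names are hypothetical.

```python
import math


def extinction_probability(lam, steps=200):
    """Smallest root of f(w) = w for f(w) = exp(lam*(w - 1)), lam > 1,
    computed by Newton's method on f1(w) = f(w) - w started at w0 = 0."""
    w = 0.0
    for _ in range(steps):
        fw = math.exp(lam * (w - 1.0))
        w = w - (fw - w) / (lam * fw - 1.0)
    return w


def approximation_41(lam):
    """Right side of (41): 1 - 2*eps/f''(1) with eps = lam - 1, f''(1) = lam**2."""
    return 1.0 - 2.0 * (lam - 1.0) / lam ** 2


if __name__ == "__main__":
    for lam in (1.01, 1.05, 1.1):
        print(lam, extinction_probability(lam), approximation_41(lam))
```

The agreement improves as $\lambda$ approaches 1, as one would expect from a single Newton step started at $w_0 = 1$.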

The following two simple examples display the results of the general theory. 
Example 1. We take $f(w) = p_0 + p_1 w + p_2 w^2$ where $p_0 + p_1 + p_2 = 1$
and $p_0, p_2 > 0$. We have $\rho = \infty$. From the equations (26) and (27),

$$f(a) = p_0 + p_1 a + p_2 a^2 = \frac{a}{\alpha}, \qquad
  f'(a) = p_1 + 2p_2 a = \frac{1}{\alpha},$$

we obtain easily

$$a = \sqrt{p_0 p_2^{-1}}, \qquad \alpha^{-1} = p_1 + 2\sqrt{p_0 p_2},$$

and it is evident that $a > 1$ is equivalent to $p_0 > p_2$, i.e. equivalent to
$f'(1) = p_1 + 2p_2 < 1$. Now


$$\Phi(z, w) = z p_0 + (z p_1 - 1) w + z p_2 w^2,$$





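hence

(42) $\varphi(z) = \dfrac{1 - z p_1 - \sqrt{(1 - z p_1)^2 - 4 z^2 p_0 p_2}}{2 z p_2},$

the choice of the sign of the radical being determined by letting $z \to 0$, and

$$p = \varphi(1) = \frac{p_0 + p_2 - \sqrt{(p_0 - p_2)^2}}{2 p_2} =
  \begin{cases} p_0 p_2^{-1}, & p_2 > p_0; \\ 1, & p_2 < p_0. \end{cases}$$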


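In the case $p_1 > 0$ we have $\delta = 1$ and then by (21)

$$P_n = \frac{1}{n} \sum_{\substack{\alpha_0 + \alpha_1 + \alpha_2 = n \\ \alpha_1 + 2\alpha_2 = n - 1}}
  \frac{n!}{\alpha_0!\,\alpha_1!\,\alpha_2!}\; p_0^{\alpha_0} p_1^{\alpha_1} p_2^{\alpha_2},$$

which can also be obtained by expansion of (42) according to powers of $z$. From
(29) we get

$$P_n \sim \frac{1}{2\sqrt{\pi}} \left( p_1 \sqrt{p_0 p_2^{-3}} + 2 p_0 p_2^{-1} \right)^{1/2}
  \left( p_1 + 2\sqrt{p_0 p_2} \right)^n n^{-3/2}.$$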

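In the case $p_1 = 0$ we have $\delta = 2$ and obtain from (42) or from (21)

$$\varphi(z) = \sum_{r=1}^{\infty} (-1)^{r-1} \binom{1/2}{r} 2^{2r-1} p_0^r p_2^{r-1} z^{2r-1}
            = \sum_{r=1}^{\infty} \frac{(2r-2)!}{r!\,(r-1)!}\, p_0^r p_2^{r-1} z^{2r-1},$$

which shows

$$P_n = \begin{cases} 0, & n = 2r; \\[4pt]
  \dfrac{(2r-2)!}{r!\,(r-1)!}\, p_0^r p_2^{r-1}, & n = 2r - 1. \end{cases}$$

By direct use of Stirling's formula or from (29) we get

$$P_{2r-1} \sim \frac{1}{p_2} \sqrt{\frac{2}{\pi}}\; 2^{2r-1} (p_0 p_2)^r (2r - 1)^{-3/2}.$$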
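The exact expressions of Example 1 can be checked by direct computation. The sketch below assumes that (21) is the Lagrange-type formula $P_n = n^{-1} \cdot [\text{coefficient of } w^{n-1} \text{ in } f(w)^n]$, written out as a multinomial sum; that reading, the function name, and the numerical coefficients are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial


def total_progeny_prob(n, p0, p1, p2):
    """P_n = (1/n) * [coefficient of w**(n-1) in (p0 + p1*w + p2*w**2)**n],
    expanded as a sum over exponents with a0 + a1 + a2 = n, a1 + 2*a2 = n - 1."""
    total = Fraction(0)
    for a2 in range(0, (n - 1) // 2 + 1):
        a1 = n - 1 - 2 * a2   # enforces a1 + 2*a2 = n - 1
        a0 = n - a1 - a2      # enforces a0 + a1 + a2 = n
        coeff = Fraction(factorial(n),
                         factorial(a0) * factorial(a1) * factorial(a2)) / n
        total += coeff * p0 ** a0 * p1 ** a1 * p2 ** a2
    return total


if __name__ == "__main__":
    p0, p1, p2 = Fraction(1, 5), Fraction(3, 10), Fraction(1, 2)
    partial_sum = sum(total_progeny_prob(n, p0, p1, p2) for n in range(1, 121))
    print(float(partial_sum), float(p0 / p2))
```

With $p_2 > p_0$ the partial sums of $P_n$ approach $p_0 p_2^{-1}$, and with $p_1 = 0$ the routine reproduces the closed form for $P_{2r-1}$ and gives $P_{2r} = 0$.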


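Example 2. We take $f(w) = e^{\lambda(w-1)}$, $\lambda > 0$, so that $W$ has a Poisson
distribution. Then $\rho = \infty$, $\delta = 1$, and we get from (26) and (27)

$$f(a) = e^{\lambda(a-1)} = a/\alpha, \qquad f'(a) = \lambda e^{\lambda(a-1)} = 1/\alpha,$$

$$a = 1/\lambda, \qquad \alpha = e^{\lambda - 1}/\lambda.$$

Clearly we have $a > 1$ if and only if $\lambda < 1$ and in this case 1 is evidently the
only solution for $w$ of $f(w) = w$, hence $p = 1$. On the other hand if $\lambda > 1$
then (41) gives $p \approx 1 - 2(\lambda - 1)\lambda^{-2}$. By (21) we get

$$P_n = \frac{(n\lambda)^{n-1}}{n!}\, e^{-n\lambda},$$

and by direct use of Stirling's formula or from (29) we get

$$P_n \sim \frac{1}{\lambda \sqrt{2\pi}} \left( \lambda e^{1-\lambda} \right)^n n^{-3/2}.$$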
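For the Poisson case the exact probabilities and their Stirling approximation can be compared numerically. The following sketch takes $P_n = (n\lambda)^{n-1} e^{-n\lambda}/n!$ as the exact expression and $(\lambda e^{1-\lambda})^n / (\lambda \sqrt{2\pi}\, n^{3/2})$ as the leading term obtained by inserting Stirling's formula for $n!$. The function names are hypothetical.

```python
import math


def poisson_total_progeny_prob(n, lam):
    """P_n = (n*lam)**(n-1) * exp(-n*lam) / n!, evaluated through logarithms
    so that large n does not overflow."""
    log_p = (n - 1) * math.log(n * lam) - n * lam - math.lgamma(n + 1)
    return math.exp(log_p)


def stirling_asymptotic(n, lam):
    """Leading term after replacing n! by sqrt(2*pi*n) * n**n * exp(-n)."""
    return (lam * math.exp(1.0 - lam)) ** n / (
        lam * math.sqrt(2.0 * math.pi) * n ** 1.5)


if __name__ == "__main__":
    lam = 0.5
    print(sum(poisson_total_progeny_prob(n, lam) for n in range(1, 2001)))
    print(poisson_total_progeny_prob(500, 0.9) / stirling_asymptotic(500, 0.9))
```

For $\lambda < 1$ the probabilities $P_n$ sum to 1, in agreement with $p = 1$, and the ratio of the exact value to the asymptotic expression tends to 1 as $n$ grows.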






REFERENCES 

[1] R. A. Fisher, The Genetical Theory of Natural Selection, Oxford, The Clarendon Press,
1930.

[2] A. J. Lotka, Théorie Analytique des Associations Biologiques 2, Hermann and Co.,
Paris, 1939.

[3] D. Hawkins and S. Ulam, Theory of Multiplicative Processes 1, MDDC-287, 1944.

[4] T. E. Harris, "Some theorems on the Bernoullian multiplicative process," thesis,
doctor of philosophy, Princeton University, 1947.

[5] A. M. Yaglom, "Certain limit theorems of the theory of branching random processes,"
Doklady Akad. Nauk SSSR (N. S.), Vol. 56 (1947), pp. 795-798.

[6] D. G. Kendall, "On the generalized 'birth-and-death' process," Annals of Math.
Stat., Vol. 19 (1948).

[7] A. Kolmogoroff, "Zur Lösung einer biologischen Aufgabe," Mitt. Forsch.-Inst. Math.
u. Mech. Univ. Tomsk, Vol. 2 (1938), pp. 1-6.

[8] L. Pontrjagin, Topological Groups, Princeton Univ. Press, 1946.

[9] J. von Neumann, Functional Operators, mimeographed notes, Institute for Advanced
Study, 1933-36.

[10] A. Kolmogoroff, Grundbegriffe der Wahrscheinlichkeitsrechnung, Chelsea Publishing
Co., New York, 1946.

[11] A. Hurwitz and R. Courant, Funktionentheorie, Springer, Berlin, 1929.

[12] M. G. Kendall, The Advanced Theory of Statistics, Vol. I, Griffin Co., London, 1943.





APPLICATION OF THE RADON-NIKODYM THEOREM TO THE
THEORY OF SUFFICIENT STATISTICS¹

By Paul R. Halmos² and L. J. Savage
University of Chicago

Summary. The body of this paper is written in terms of very general and
abstract ideas which have been popular in pure mathematical work on the theory
of probability for the last two or three decades. It seems to us that these ideas,
so fruitful in pure mathematics, have something to contribute to mathematical
statistics also, and this paper is an attempt to illustrate the sort of contribution
we have in mind. The purpose of generality here is not to solve immediate
practical problems, but rather to capture the logical essence of an important
concept (sufficient statistic), and in particular to disentangle that concept from
such ideas as Euclidean space, dimensionality, partial differentiation, and the
distinction between continuous and discrete distributions, which seem to us
extraneous.

In accordance with these principles the center of the stage is occupied by a
completely abstract sample space, that is, a set $X$ of objects $x$, to be thought
of as possible outcomes of an experimental program, distributed according to an
unknown one of a certain set of probability measures. Perhaps the most familiar
concrete example in statistics is the one in which $X$ is $n$ dimensional Cartesian
space, the points of which represent $n$ independent observations of a normally
distributed random variable with unknown parameters, and in which the
probability measures considered are those induced by the various common
normal distributions of the individual observations.

A statistic is defined, as usual, to be a function $T$ of the outcome, whose
values, however, are not necessarily real numbers but may themselves be abstract
entities. Thus, in the concrete example, the entire set of $n$ observations, or,
less trivially, the sequence of all sample moments about the origin are statistics
with values in an $n$ dimensional and in an infinite dimensional space respectively.
Another illuminating and very general example of a statistic may be obtained as
follows. Suppose that the outcomes of two not necessarily statistically
independent programs are thought of as one united outcome; then the outcome $T$
of the first program alone is a statistic relative to the united program. A
technical measure theoretic result, known as the Radon-Nikodym theorem, is
important in the study of statistics such as $T$. It is, for example, essential
to the very definition of the basic concept of conditional probability of a subset
$E$ of $X$ given a value $y$ of $T$.

The statistic $T$ is called sufficient for the given set $\mathfrak{M}$ of probability measures
if (somewhat loosely speaking) the conditional probability of a subset $E$ of $X$
given a value $y$ of $T$ is the same for every probability measure in $\mathfrak{M}$. It is, for
instance, well known that the sample mean and variance together form a sufficient
statistic for the measures described in the concrete example.

¹ This paper was the basis of a lecture delivered upon invitation of the Institute at the
meeting in Chicago on December 30, 1947.

² Fellow of the John Simon Guggenheim Memorial Foundation.

The theory of sufficiency is in an especially satisfactory state for the case
in which the set $\mathfrak{M}$ of probability measures satisfies a certain condition described
by the technical term dominated. A set $\mathfrak{M}$ of probability measures is called
dominated if each measure in the set may be expressed as the indefinite integral
of a density function with respect to a fixed measure which is not itself necessarily
in the set. It is easy to verify that both classical extremes, commonly referred
to as the discrete and continuous cases, are dominated.

One possible formulation of the principal result concerning sufficiency for
dominated sets is a direct generalization to the abstract case of the well known
Fisher-Neyman result: $T$ is sufficient if and only if the densities can be written as
products of two factors, the first of which depends on the outcome through $T$
only and the second of which is independent of the unknown measure. Another
way of phrasing this result is to say that $T$ is sufficient if and only if the likelihood
ratio of every pair of measures in $\mathfrak{M}$ depends on the outcome through $T$ only.
The latter formulation makes sense even in the not necessarily dominated case
but unfortunately it is not true in that case. The situation can be patched up
somewhat by introducing a weaker notion called pairwise sufficiency.

In ordinary statistical parlance one often speaks of a statistic sufficient 
for some of several parameters. The abstract results mentioned above can 
undoubtedly be extended to treat this concept. 

1. Basic definitions and notations. A measurable space $(X, \mathbf{S})$ is a set $X$
and a $\sigma$-algebra $\mathbf{S}$ of subsets of $X$.³ If $(X, \mathbf{S})$ and $(Y, \mathbf{T})$ are measurable
spaces and if $T$ is a transformation from $X$ into $Y$ (or, in other words, if $T$
is a function with domain $X$ and range in $Y$), then $T$ is measurable if, for every $F$
in $\mathbf{T}$, $T^{-1}(F) \in \mathbf{S}$. If $Y$ is a Borel set in a finite dimensional Euclidean space,
then we shall always understand that $\mathbf{T}$ is the class of all Borel subsets of $Y$,
and the measurability of a function $f$ from $X$ to $Y$ will be expressed by the
notation $f\ (\epsilon)\ \mathbf{S}$.

Throughout most of what follows it will be assumed that $(X, \mathbf{S})$ and $(Y, \mathbf{T})$
are fixed measurable spaces and that $T$ is a measurable transformation (also
called a statistic) from $X$ onto $Y$. A helpful example to keep in mind is the
Cartesian plane in the role of $X$, its horizontal coordinate axis in the role of $Y$,
and perpendicular projection from $X$ onto $Y$ in the role of $T$.

The following notations will be used. If $g$ is a point function on $Y$ (with
arbitrary range), then $gT$ is the point function on $X$ defined by $gT(x) = g(T(x))$.
If $\mu$ is a set function (with arbitrary range) on $\mathbf{S}$, then $\mu T^{-1}$ is the set function
on $\mathbf{T}$ defined by $\mu T^{-1}(F) = \mu(T^{-1}(F))$. The class of all sets of the form $T^{-1}(F)$,
with $F \in \mathbf{T}$, will be denoted by $T^{-1}(\mathbf{T})$. The characteristic function of a set $A$
(in any space) will be denoted by $\chi_A$.

³ A $\sigma$-algebra is a non empty class $\mathbf{S}$ of sets, closed under the formation of complements
and countable unions. If $(X, \mathbf{S})$ is a measurable space, the sets of $\mathbf{S}$ will be called the
measurable sets of $X$.

Lemma 1. If $g$ is any function on $Y$ and $A$ is any set in the range of $g$, then

$$\{x: gT(x) \in A\} = T^{-1}(\{y: g(y) \in A\});$$

hence, in particular, $\chi_{T^{-1}(F)} = \chi_F T$ for every subset $F$ of $Y$.⁴

Proof. The following statements are mutually equivalent: (a) $x_0 \in
\{x: gT(x) \in A\}$, (b) $g(T(x_0)) \in A$, (c) if $y_0 = T(x_0)$, then $g(y_0) \in A$, and (d)
$T(x_0) \in \{y: g(y) \in A\}$. The equivalence of the first and last ones of these
statements is exactly the assertion of the lemma.

We shall have frequent occasion to deal with functions on $X$ which are induced
by measurable functions on $Y$; the following result is a useful and direct structural
characterization of such functions.

Lemma 2. If $f$ is a real valued function on $X$, then a necessary and sufficient
condition that there exist a measurable function $g$ on $Y$ such that $f = gT$ is that
$f\ (\epsilon)\ T^{-1}(\mathbf{T})$; if such a function $g$ exists, then it is unique.⁵

Proof. The necessity of the condition is clear. To prove sufficiency,
suppose that $f\ (\epsilon)\ T^{-1}(\mathbf{T})$, let $y_0 \in Y$, and write $X_0 = T^{-1}(\{y_0\})$. Suppose $x_0 \in X_0$
and write $E = \{x: f(x) = f(x_0)\}$. Since $f\ (\epsilon)\ T^{-1}(\mathbf{T})$, there is a set $F$ in $\mathbf{T}$ such
that $E = T^{-1}(F)$. Since $x_0 \in E$, it follows that $y_0 \in F$ and therefore that

$$X_0 = T^{-1}(\{y_0\}) \subset T^{-1}(F) = E.$$

In other words $f$ is a constant on $X_0$ and consequently the equation $g(y_0) = f(x_0)$
unambiguously defines a function $g$ on $Y$. The facts that $f = gT$ and that $g$ is
measurable are clear; the uniqueness of $g$ follows from the fact that $T$ maps
$X$ onto $Y$.
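The content of Lemma 2 is easiest to see on a finite space. The sketch below is an illustration under simplifying assumptions: $X$ is a finite list, every subset of $Y$ is taken to be measurable, and the helper name `factor_through` is hypothetical. In that setting $f\ (\epsilon)\ T^{-1}(\mathbf{T})$ holds exactly when $f$ is constant on each set $T^{-1}(\{y\})$, and $g$ is read off from those constant values.

```python
def factor_through(f, T, X):
    """Return a dict g with f(x) == g[T(x)] for every x in X, or None when f
    takes two different values on some set T^{-1}({y})."""
    g = {}
    for x in X:
        y = T(x)
        if y in g and g[y] != f(x):
            return None      # f is not constant on the fibre over y
        g[y] = f(x)
    return g


if __name__ == "__main__":
    X = [(i, j) for i in range(3) for j in range(3)]
    project = lambda x: x[0]                # projection onto the first coordinate
    print(factor_through(lambda x: x[0] ** 2, project, X))
    print(factor_through(lambda x: x[0] + x[1], project, X))
```

The first function depends on $x$ only through the projection and therefore factors; the second does not.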

2. Measures and their derivatives. A measure is a real valued, non negative,
finite (and therefore bounded), countably additive function on the measurable
sets of a measurable space.⁶ An integral whose domain of integration is not
indicated is always to be extended over the whole space. If the symbol
$[\mu]$, pronounced "modulo $\mu$", follows an assertion concerning the points $x$ of
$X$, it is to be understood that the set $E$ of those points for which the assertion
is not true is such that $E \in \mathbf{S}$ and $\mu(E) = 0$. Thus, for instance, if $f$
and $g$ are functions (with arbitrary range) on $X$, then $f = g\ [\mu]$ means that
$\mu(\{x: f(x) \neq g(x)\}) = 0$. Similarly, if $f$ is a real valued function on $X$, then
$f\ (\epsilon)\ T^{-1}(\mathbf{T})\ [\mu]$ means that there exists a real valued function $g$ on $X$ such
that $g\ (\epsilon)\ T^{-1}(\mathbf{T})$ and $f = g\ [\mu]$.

⁴ The symbol $\{\,\text{---} : \text{---}\,\}$ stands for the set of all those objects named before the colon
which satisfy the condition stated after it.

⁵ The notation $f\ (\epsilon)\ T^{-1}(\mathbf{T})$ means of course that $f$ is a measurable function not only on the
measurable space $(X, \mathbf{S})$ but also on the measurable space $(X, T^{-1}(\mathbf{T}))$. The restriction to
real valued functions is inessential and is made only in order to avoid the introduction
of more notation.

⁶ Although most of the measures occurring in the applications of our theory are probability
measures (i.e. measures whose value for the whole space is 1), the consideration of
probability measures only is, in many of the proofs in the sequel, both unnecessary and insufficient.

If $\mu$ and $\nu$ are two measures on $\mathbf{S}$, $\nu$ is absolutely continuous with respect to $\mu$,
in symbols $\nu \ll \mu$, if $\nu(E) = 0$ for every measurable set $E$ for which $\mu(E) = 0$.
The measures $\mu$ and $\nu$ are equivalent, in symbols $\mu \equiv \nu$,⁷ if simultaneously $\mu \ll \nu$
and $\nu \ll \mu$. One of the most useful results concerning absolute continuity is the
Radon-Nikodym theorem, which may be stated as follows.⁸

A necessary and sufficient condition that $\nu \ll \mu$ is that there exist a non negative
function $f$ on $X$ such that

$$\nu(E) = \int_E f(x)\, d\mu(x)$$

for every $E$ in $\mathbf{S}$. The function $f$ is unique in the sense that if also

$$\nu(E) = \int_E g(x)\, d\mu(x)$$

for every $E$ in $\mathbf{S}$, then $f = g\ [\mu]$. If $\nu(E) \leq \mu(E)$ for every $E$ in $\mathbf{S}$, then
$0 \leq f(x) \leq 1\ [\mu]$.

It is customary and suggestive to write $f = d\nu/d\mu$. Since $d\nu/d\mu$ is determined
only to within a set for which $\mu$ vanishes, it follows that in a relation of the form

$$\frac{d\nu}{d\mu}\ (\epsilon)\ T^{-1}(\mathbf{T})\ [\mu]$$

the symbol $[\mu]$ is superfluous and may be omitted.

For typographical and heuristic reasons it is convenient sometimes to write the
relation $f = d\nu/d\mu$ in the form $d\nu = f\,d\mu$; all the properties of Radon-Nikodym
derivatives which are suggested by the well known differential formalism
correspond to true theorems. Some of the ones that we shall make use of are
trivial (e.g. $d\nu_1 = f_1\,d\mu$ and $d\nu_2 = f_2\,d\mu$ imply $d(\nu_1 + \nu_2) = (f_1 + f_2)\,d\mu$), while
others are well known facts in integration theory (e.g. (i) $d\lambda = f\,d\nu$ and $d\nu = g\,d\mu$
imply $d\lambda = fg\,d\mu$, and (ii) $d\nu = f\,d\mu$ and $d\mu = g\,d\nu$ imply $fg = 1\ [\mu]$).

We conclude this section with a simple but useful result concerning the
transformations of integrals.

Lemma 3. If $g$ is a real valued function on $Y$ and $\mu$ is a measure on $\mathbf{S}$, then

$$\int_F g(y)\, d\mu T^{-1}(y) = \int_{T^{-1}(F)} gT(x)\, d\mu(x)$$

for every $F$ in $\mathbf{T}$, in the sense that if either integral exists, then so does the other and
the two are equal.


⁷ It is clear that the relation of equivalence is reflexive, symmetric, and transitive,
and hence deserves its name.

⁸ For a proof of the Radon-Nikodym theorem and similar facts concerning the measure
and integration theory which we employ, see S. Saks, Theory of the Integral,
Warszawa-Lwów, 1937.





Proof. Replacing $g$ by $\chi_F\,g$ we see that it is sufficient to consider the case
$F = Y$. The proof for this case follows from the observation that every
approximating sum

$$\sum_i g(y_i)\, \mu T^{-1}(F_i)$$

of $\int g\, d\mu T^{-1}$ is also an approximating sum

$$\sum_i gT(x_i)\, \mu(E_i)$$

of $\int gT\, d\mu$, and conversely.⁹
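On a finite space the integrals of Lemma 3 are finite sums, and the identity can be verified directly. The sketch below is such an illustration. The representation of a measure as a dictionary of point masses and all function names are assumptions made for the example.

```python
from collections import defaultdict
from fractions import Fraction


def image_measure(mu, T):
    """Return mu T^{-1} as point masses on Y: (mu T^{-1})({y}) = mu(T^{-1}({y}))."""
    nu = defaultdict(Fraction)
    for x, mass in mu.items():
        nu[T(x)] += mass
    return dict(nu)


def integral_over_Y(g, nu, F):
    """Integral of g over F with respect to the measure nu on Y."""
    return sum(g[y] * nu[y] for y in nu if y in F)


def integral_over_X(g, T, mu, F):
    """Integral of gT over T^{-1}(F) with respect to mu."""
    return sum(g[T(x)] * mass for x, mass in mu.items() if T(x) in F)


if __name__ == "__main__":
    mu = {(i, j): Fraction(i + j + 1, 18) for i in range(3) for j in range(2)}
    T = lambda x: x[0]
    g = {0: 2, 1: -1, 2: 5}
    F = {0, 2}
    print(integral_over_Y(g, image_measure(mu, T), F), integral_over_X(g, T, mu, F))
```

Both sides are obtained from one another by grouping the points $x$ according to the value $y = T(x)$.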

3. Conditional probabilities and expectations. Lemma 4. If $\mu$ and $\nu$ are
measures on $\mathbf{S}$ such that $\nu \ll \mu$, then $\nu T^{-1} \ll \mu T^{-1}$.

Proof. If $F \in \mathbf{T}$ and $0 = \mu T^{-1}(F) = \mu(T^{-1}(F))$, then

$$0 = \nu(T^{-1}(F)) = \nu T^{-1}(F).^{10}$$

Lemma 4 is the basis of the definition of a concept of great importance in
probability theory. If $\mu$ is a measure on $\mathbf{S}$ and $f$ is a non negative integrable
function on $X$, then the measure $\nu$ defined by $d\nu = f\,d\mu$ is absolutely continuous
with respect to $\mu$. It follows from Lemma 4 that $\nu T^{-1}$ is absolutely continuous
with respect to $\mu T^{-1}$; we write $d\nu T^{-1} = g\,d\mu T^{-1}$. The function value $g(y)$ is
known as the conditional expectation of $f$ given $y$ (or given that $T(x) = y$); we
shall denote it by $e_\mu(f \mid y)$. If $f = \chi_E$ is the characteristic function of a set $E$ in
$\mathbf{S}$, then $e_\mu(\chi_E \mid y)$ is known as the conditional probability of $E$ given $y$; we shall
denote it by $p_\mu(E \mid y)$.¹¹

The abstract nature of these definitions makes an intuitive justification of
them desirable. Observe that since

$$\nu T^{-1}(F) = \nu(T^{-1}(F)) = \int_{T^{-1}(F)} f(x)\, d\mu(x),$$

the defining equation of $e_\mu(f \mid y)$, written out in full detail, takes the form

$$\int_{T^{-1}(F)} f(x)\, d\mu(x) = \int_F e_\mu(f \mid y)\, d\mu T^{-1}(y), \qquad F \in \mathbf{T}.$$

⁹ It is of interest to observe that either side of the equation in Lemma 3 may be obtained
from the other by the formal substitution $y = T(x)$. A special case of this lemma is the
celebrated and often misunderstood assertion that the expectation of a random variable is
equal to the first moment of its distribution function.

¹⁰ That the converse of Lemma 4 is not true is shown by the following example. Let $X$ be
the unit square, let $Y$ be the unit interval, and let $T$ be the perpendicular projection from
$X$ onto $Y$. Let $\mu$ be ordinary (Borel-Lebesgue) measure and let $\nu$ be linear measure on the
intersection of $X$ with, say, the horizontal line whose ordinate is $\tfrac{1}{2}$. Clearly $\nu$ is not
absolutely continuous with respect to $\mu$, but $\nu T^{-1} = \mu T^{-1}$.

¹¹ Definitions in this form were first proposed by A. Kolmogoroff, Grundbegriffe der
Wahrscheinlichkeitsrechnung, Berlin, 1933. With a slight amount of additional trouble,
conditional expectation could be defined for more general functions, but only the non
negative case will occur in our applications.





If $f = \chi_E$, then this equation becomes the defining equation of $p_\mu(E \mid y)$:

$$\mu(E \cap T^{-1}(F)) = \int_F p_\mu(E \mid y)\, d\mu T^{-1}(y), \qquad F \in \mathbf{T}.^{12}$$

The customary definition of "the conditional probability of $E$ given that $T(x) \in F$"
is $\mu(E \cap T^{-1}(F))/\mu(T^{-1}(F))$ (assuming that the denominator does not vanish).
Since $\mu(T^{-1}(F)) = \mu T^{-1}(F)$, we have

$$\frac{\mu(E \cap T^{-1}(F))}{\mu(T^{-1}(F))} = \frac{1}{\mu T^{-1}(F)} \int_F p_\mu(E \mid y)\, d\mu T^{-1}(y).$$

It is now formally plausible that if "$F$ shrinks to a point $y$," then the left side
of the last written equation should tend to the conditional probability of $E$
given $y$ and the right side should tend to the integrand $p_\mu(E \mid y)$. The use of
the Radon-Nikodym differentiation theorem is a rigorous substitute for this
rather shaky difference quotient approach.
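In the discrete case the Radon-Nikodym derivative reduces to an ordinary quotient, and the defining equation of $p_\mu(E \mid y)$ can be checked set by set. The sketch below assumes a finite space with a measure given by point masses; the function names are hypothetical, and points $y$ of zero mass are simply omitted, since the conditional probability is undetermined there.

```python
from fractions import Fraction
from itertools import combinations


def image_measure(mu, T):
    """Point masses of mu T^{-1} on Y."""
    nu = {}
    for x, mass in mu.items():
        nu[T(x)] = nu.get(T(x), Fraction(0)) + mass
    return nu


def conditional_probability(mu, T, E):
    """p(E | y) = mu(E ∩ T^{-1}({y})) / mu T^{-1}({y}) for every y of positive mass."""
    nu = image_measure(mu, T)
    joint = {}
    for x, mass in mu.items():
        if x in E:
            joint[T(x)] = joint.get(T(x), Fraction(0)) + mass
    return {y: joint.get(y, Fraction(0)) / m for y, m in nu.items() if m > 0}


if __name__ == "__main__":
    mu = {(i, j): Fraction(1 + i + 2 * j) for i in range(3) for j in range(2)}
    T = lambda x: x[0]
    E = {x for x in mu if x[1] == 1}
    print(conditional_probability(mu, T, E))
```

Note that the measure in the example is not normalized; the quotient nevertheless lies between 0 and 1, and $p(X \mid y) = 1$.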

Since $p_\mu(E \mid y)$ is determined, for each $E$, only to within a set for which $\mu T^{-1}$
vanishes, it would be too optimistic to expect that, for each $y$, it behaves, regarded
as a function of $E$, like a measure. It is, however, easy to prove that

(i) $p_\mu(X \mid y) = 1\ [\mu T^{-1}]$,

(ii) $0 \leq p_\mu(E \mid y) \leq 1\ [\mu T^{-1}]$,

(iii) if $\{E_n\}$ is a disjoint sequence of measurable sets, then
$p_\mu(\bigcup_n E_n \mid y) = \sum_n p_\mu(E_n \mid y)\ [\mu T^{-1}]$.

The exceptional sets of measure zero depend in general on $E$ in (ii) and on the
particular sequence $\{E_n\}$ in (iii). It is interesting to observe that, despite the
fact that $\mu$ need not be a probability measure, $p_\mu$ turns out always to have the
normalization property (i). It is natural to ask whether or not the indeterminacy
of $p_\mu(E \mid y)$ may be resolved, for each $E$, in such a way that the resulting function
is a measure for each $y$, except possibly for a fixed set of $y$'s on which $\mu T^{-1}$
vanishes. Doob¹³ has shown that this is the case when $X$ is the real line; in the
general case such a resolution is impossible.¹⁴ Fortunately, however, conditional
probabilities are sufficiently tractable for most practical and theoretical purposes,
and the requirement that they should behave like probability measures in the
strict sense described above is almost never needed.

¹² We observe that it is not sufficient to require this for $F = Y$ only, i.e. to require
$\mu(E) = \int p_\mu(E \mid y)\, d\mu T^{-1}(y)$. This special equation is satisfied by many functions which
do not deserve the name conditional probability; e.g. it is satisfied by $p_\mu(E \mid y) =$
constant $= \mu(E)/\mu T^{-1}(Y)$.

¹³ See J. L. Doob, "Stochastic processes with an integral-valued parameter," Am. Math.
Soc. Trans., Vol. 44 (1938), pp. 96-98.

¹⁴ See Doob, loc. cit. Doob asserts the theorem in much greater generality, but his
proof is incorrect. The error in the proof and a counterexample to the general theorem
were communicated to us by J. Dieudonné in a letter dated September 4, 1947. Doob's
proof is valid for more general spaces than the real line (e.g. for finite dimensional Euclidean
spaces and for compact metric spaces). The details of Dieudonné's counterexample will
appear in a forthcoming book (entitled Measure theory) by Halmos.





We conclude this section with two easy but useful results which might also 
serve as illustrations of the method of finding conditional probabilities and 
expectations in certain special cases. 

Lemma 5. If $\mu$ is a measure on $\mathbf{S}$, if $g$ is a non negative function on $Y$, integrable
with respect to $\mu T^{-1}$, and if $\nu$ is the measure on $\mathbf{S}$ defined by $d\nu = gT\,d\mu$, then
$d\nu T^{-1} = g\,d\mu T^{-1}$, or, equivalently, $e_\mu(gT \mid y) = g(y)\ [\mu T^{-1}]$.

Proof. From $\nu(E) = \int_E gT(x)\, d\mu(x)$ and Lemma 3 it follows that

$$\nu T^{-1}(F) = \nu(T^{-1}(F)) = \int_F g(y)\, d\mu T^{-1}(y).$$

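Lemma 6. If $\mu$ is a measure on $\mathbf{S}$, if $f$ and $g$ are non negative functions on $X$ and
$Y$ respectively, and if $f$, $gT$, and $f \cdot gT$ are all integrable with respect to $\mu$, then

$$e_\mu(f \cdot gT \mid y) = e_\mu(f \mid y) \cdot g(y) \quad [\mu T^{-1}].$$

Hence, in particular, if $F \in \mathbf{T}$, then

$$p_\mu(E \cap T^{-1}(F) \mid y) = p_\mu(E \mid y)\, \chi_F(y) \quad [\mu T^{-1}]$$

for every $E$ in $\mathbf{S}$.

Proof. If $d\nu = f\,d\mu$, then, by definition of $e_\mu$, $\nu T^{-1}(F) = \int_F e_\mu(f \mid y)\, d\mu T^{-1}(y)$.
Applications of Lemmas 3 and 5 yield

$$\int_F e_\mu(f \mid y)\, g(y)\, d\mu T^{-1}(y) = \int_F g(y)\, d\nu T^{-1}(y) = \int_{T^{-1}(F)} gT(x)\, d\nu(x)$$

$$= \int_{T^{-1}(F)} f(x)\, gT(x)\, d\mu(x) = \int_F e_\mu(f \cdot gT \mid y)\, d\mu T^{-1}(y),$$

and therefore the desired conclusion follows from the uniqueness assertion of the
Radon-Nikodym theorem.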

4. Dominated sets of measures. In many statistical situations it is necessary
to consider simultaneously several measures on the same $\sigma$-algebra. The
concept of absolute continuity is easily extended to sets of measures. If $\mathfrak{M}$
and $\mathfrak{N}$ are two sets of measures on $\mathbf{S}$ and if, for every set $E$ in $\mathbf{S}$, the vanishing of
$\mu(E)$ for every $\mu$ in $\mathfrak{M}$ implies the vanishing of $\nu(E)$ for every $\nu$ in $\mathfrak{N}$, then we
shall call $\mathfrak{N}$ absolutely continuous with respect to $\mathfrak{M}$ and write $\mathfrak{N} \ll \mathfrak{M}$. If
$\mathfrak{N} \ll \mathfrak{M}$ and $\mathfrak{M} \ll \mathfrak{N}$, the sets $\mathfrak{M}$ and $\mathfrak{N}$ are called equivalent and we write $\mathfrak{M} \equiv \mathfrak{N}$.
If, in particular, $\mathfrak{M}$ contains exactly one measure $\mu$, $\mathfrak{M} = \{\mu\}$, the abbreviated
notations $\mathfrak{N} \ll \mu$, $\mu \ll \mathfrak{N}$, and $\mu \equiv \mathfrak{N}$ will be employed for $\mathfrak{N} \ll \mathfrak{M}$, $\mathfrak{M} \ll \mathfrak{N}$, and
$\mathfrak{M} \equiv \mathfrak{N}$, respectively.

A set $\mathfrak{M}$ of measures on $\mathbf{S}$ will be called dominated if there exists a measure $\lambda$
on $\mathbf{S}$ (not necessarily in $\mathfrak{M}$) such that $\mathfrak{M} \ll \lambda$. In applications there frequently
occur sets of measures which are dominated in a sense apparently weaker than
the one just defined, weaker in that the measure $\lambda$, which may for instance be
Lebesgue measure on the Borel sets of a finite dimensional Euclidean space,
is not necessarily finite. It is easy to see, however, that whenever $\lambda$ has the
property (possessed by Lebesgue measure) that the space $X$ is the union of
countably many sets of finite measure, then a finite measure equivalent to $\lambda$
exists and the two possible definitions of domination coincide.

The following result on dominated sets of measures may be found to have 
some interest of its own and will be applied in the sequel. 

Lemma 7. Every dominated set of measures has an equivalent countable subset. 

Proof. Let $\mathfrak{M}$ be a dominated set of measures on $\mathbf{S}$, $\mathfrak{M} \ll \lambda$; for any $\mu$ in $\mathfrak{M}$
write $f_\mu = d\mu/d\lambda$ and $K_\mu = \{x: f_\mu(x) > 0\}$. We define (for the purposes of this
proof only) a kernel as a set $K$ in $\mathbf{S}$ such that, for some measure $\mu$ in $\mathfrak{M}$, $K \subset K_\mu$
and $\mu(K) > 0$; we define a chain as a disjoint union of kernels. Since $\lambda(K) > 0$
for every kernel $K$, it follows from the finiteness of $\lambda$ that every chain is a countable
disjoint union of kernels. It follows also from these definitions that if $C$ is a
measurable subset of a chain, such that $\mu(C) > 0$ for at least one measure $\mu$ in $\mathfrak{M}$,
then $C$ is a chain, and that a disjoint union of chains is a chain. The last two
remarks imply, through the usual process of disjointing any countable union,
that a countable (but not necessarily disjoint) union of chains is a chain.

Let $\{C_j\}$ be a sequence of chains such that, as $j \to \infty$, $\lambda(C_j)$ approaches
the supremum of the values of $\lambda$ on chains. If $C = \bigcup_{j=1}^{\infty} C_j$, then $C$ is a chain
for which $\lambda(C)$ is maximal. The definition of a chain yields the existence of a
sequence $\{K_i\}$ of kernels such that $C = \bigcup_{i=1}^{\infty} K_i$, and the definition of a kernel
yields the existence, for each $i = 1, 2, \cdots$, of a measure $\mu_i$ in $\mathfrak{M}$ such that
$K_i \subset K_{\mu_i}$ and $\mu_i(K_i) > 0$. We write $\mathfrak{N} = \{\mu_1, \mu_2, \cdots\}$; since $\mathfrak{N} \subset \mathfrak{M}$, the
relation $\mathfrak{N} \ll \mathfrak{M}$ is trivial. We shall prove that $\mathfrak{M} \ll \mathfrak{N}$.

Suppose that $E \in \mathbf{S}$, $\mu_i(E) = 0$ for $i = 1, 2, \cdots$, and let $\mu$ be any measure in
$\mathfrak{M}$. It is to be proved that $\mu(E) = 0$. Since $\mu(E - K_\mu) = 0$, there is no loss of
generality in assuming that $E \subset K_\mu$. If $\mu(E - C) > 0$, then $\lambda(E - C) > 0$
and therefore (since $E - C$ is a kernel) $E \cup C$ is a chain with $\lambda(E \cup C) > \lambda(C)$.
Since this is impossible, it follows that $\mu(E - C) = 0$. Since

$$0 = \mu_i(E) = \mu_i(E \cap K_i) = \int_{E \cap K_i} f_{\mu_i}\, d\lambda$$

and since $K_i \subset K_{\mu_i}$, it follows that $\lambda(E \cap K_i) = 0$.
We conclude that $\lambda(E \cap C) = \sum_{i=1}^{\infty} \lambda(E \cap K_i) = 0$ and therefore $\mu(E \cap C) = 0$.
Since $\mu(E) = \mu(E - C) + \mu(E \cap C)$, the proof of the lemma is complete.


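5. Sufficient statistics for dominated sets. The statistic $T$ is sufficient¹⁵ for a
set $\mathfrak{M}$ of measures on $\mathbf{S}$ if, for every $E$ in $\mathbf{S}$, there exists a measurable function
$p = p(E \mid y)$ on $Y$, such that

$$p_\mu(E \mid y) = p(E \mid y) \quad [\mu T^{-1}]$$

for every $\mu$ in $\mathfrak{M}$. In other words, $T$ is sufficient for $\mathfrak{M}$ if there exists a conditional
probability function common to every $\mu$ in $\mathfrak{M}$, or, crudely speaking, if the
conditional distribution induced by $T$ is independent of $\mu$.

¹⁵ The original definition of sufficiency was given by R. A. Fisher, "On the mathematical
foundations of theoretical statistics," Roy. Soc. Phil. Trans., Series A, Vol. 222 (1922).

Theorem 1. A necessary and sufficient condition that the statistic $T$ be sufficient
for a dominated set $\mathfrak{M}$ of measures on $\mathbf{S}$ is that there exist a measure $\lambda$ on $\mathbf{S}$ such that
$\mathfrak{M} \equiv \lambda$ and such that $d\mu/d\lambda\ (\epsilon)\ T^{-1}(\mathbf{T})$ for every $\mu$ in $\mathfrak{M}$.

Proof. Suppose first that $T$ is sufficient for $\mathfrak{M}$. Let $\mathfrak{N} = \{\mu_1, \mu_2, \cdots\}$ be a
countable subset of $\mathfrak{M}$ equivalent to $\mathfrak{M}$ (Lemma 7), and write $\lambda$ for the measure
defined by

$$\lambda(E) = \sum_{i=1}^{\infty} a_i \mu_i(E),$$

where $a_i = 1/(2^i \mu_i(X))$, $i = 1, 2, \cdots$. Clearly $\mathfrak{M} \equiv \lambda$. If $p$ is the conditional
probability function common to every $\mu$ in $\mathfrak{M}$, then, for every $E$ in $\mathbf{S}$ and every $F$ in $\mathbf{T}$,

$$\lambda(E \cap T^{-1}(F)) = \sum_{i=1}^{\infty} a_i \mu_i(E \cap T^{-1}(F))
 = \sum_{i=1}^{\infty} a_i \int_F p(E \mid y)\, d\mu_i T^{-1}(y) = \int_F p(E \mid y)\, d\lambda T^{-1}(y),$$

i.e. $p$ serves also as a conditional probability for $\lambda$.

For any fixed $\mu$ in $\mathfrak{M}$, write $d\mu/d\lambda = f$ and $e_\lambda(f \mid y) = g(y)$. Then
$d\mu T^{-1} = g\,d\lambda T^{-1}$, and we have, for every $E$ in $\mathbf{S}$,

$$\int_E f(x)\, d\lambda(x) = \mu(E) = \int p(E \mid y)\, d\mu T^{-1}(y)$$

$$= \int p(E \mid y)\, g(y)\, d\lambda T^{-1}(y) = \int e_\lambda(\chi_E \mid y)\, e_\lambda(gT \mid y)\, d\lambda T^{-1}(y)$$

$$= \int e_\lambda(\chi_E \cdot gT \mid y)\, d\lambda T^{-1}(y) = \int \chi_E(x)\, gT(x)\, d\lambda(x) = \int_E gT(x)\, d\lambda(x).$$

The desired result, $f(x) = gT(x)\ [\lambda]$, follows from a comparison of the first and
last terms in the last written chain of equations.

Suppose, conversely, that $\lambda$ has the stated properties. For any fixed $\mu$ in $\mathfrak{M}$ we may
write (Lemma 2) $d\mu/d\lambda = gT$, so that (Lemma 5) $d\mu T^{-1} = g\,d\lambda T^{-1}$. To
prove that $p_\lambda$ is a conditional probability for $\mu$, let $E$ be any set in $\mathbf{S}$ and write
$d\nu = \chi_E\,d\mu$. By the definition of $p_\mu$,

$$d\nu T^{-1} = p_\mu\,d\mu T^{-1} = p_\mu \cdot g\,d\lambda T^{-1}.$$

On the other hand $d\nu = \chi_E\,d\mu = \chi_E \cdot gT\,d\lambda$, so that

$$d\nu T^{-1} = e_\lambda\,d\lambda T^{-1},$$

where $e_\lambda = e_\lambda(\chi_E \cdot gT \mid y) = p_\lambda(E \mid y)\, g(y)$. It follows from a comparison of
the two expressions for $d\nu T^{-1}$ that

$$p_\mu(E \mid y)\, g(y) = p_\lambda(E \mid y)\, g(y) \quad [\lambda T^{-1}].$$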





Since the relation $d\mu T^{-1} = g\,d\lambda T^{-1}$ clearly implies that $g(y) \neq 0\ [\mu T^{-1}]$ (i.e.
that $\mu T^{-1}(\{y: g(y) = 0\}) = 0$), it follows, finally, that

$$p_\mu(E \mid y) = p_\lambda(E \mid y) \quad [\mu T^{-1}].$$

6. Special criteria for sufficiency. Theorem 1 may be recast in a form more
akin in spirit to previous investigations of the concept of sufficiency.¹⁶

Corollary 1. A necessary and sufficient condition that the statistic $T$ be
sufficient for a dominated set $\mathfrak{M}$ ($\ll \lambda_0$) of measures on $\mathbf{S}$ is that, for every $\mu$ in $\mathfrak{M}$,
$f_\mu = d\mu/d\lambda_0$ be factorable in the form $f_\mu = g_\mu \cdot t$, where $0 \leq g_\mu\ (\epsilon)\ T^{-1}(\mathbf{T})$, $0 \leq t$, $t$
and $g_\mu \cdot t$ are integrable with respect to $\lambda_0$, and $t$ vanishes $[\lambda_0]$ on each set in $\mathbf{S}$ for
which every $\mu$ in $\mathfrak{M}$ vanishes.

In more customary statistical language the condition asserts essentially that
"each density is factorable into a function of the statistic alone and a function
independent of the parameter."
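A familiar instance of the factorization is a sample of $n$ Bernoulli trials with the number of successes as statistic: the density $\theta^{t}(1-\theta)^{n-t}$ depends on the outcome only through $t$, and the second factor may be taken identically 1. The sketch below is an illustration chosen for this note, not an example from the text, and its function names are hypothetical. It verifies that the conditional probability of each outcome given the total is $1/\binom{n}{t}$ whatever the value of $\theta$.

```python
from fractions import Fraction
from itertools import product
from math import comb


def bernoulli_sample_measure(theta, n):
    """Point masses theta**t * (1 - theta)**(n - t) on {0,1}^n, t = number of ones."""
    return {x: theta ** sum(x) * (1 - theta) ** (n - sum(x))
            for x in product((0, 1), repeat=n)}


def conditional_given_total(mu):
    """Conditional probability of each outcome x given the statistic T(x) = sum(x)."""
    totals = {}
    for x, mass in mu.items():
        totals[sum(x)] = totals.get(sum(x), 0) + mass
    return {x: mass / totals[sum(x)] for x, mass in mu.items()}


if __name__ == "__main__":
    n = 4
    for theta in (Fraction(1, 3), Fraction(1, 2), Fraction(3, 4)):
        cond = conditional_given_total(bernoulli_sample_measure(theta, n))
        print(theta, cond[(1, 0, 1, 0)], Fraction(1, comb(n, 2)))
```

The common conditional probability function is exactly what the definition of sufficiency requires.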

Proof. If $T$ is sufficient for $\mathfrak{M}$, then there exists a measure $\lambda$ with the
properties described in Theorem 1. It follows that

$$f_\mu = \frac{d\mu}{d\lambda_0} = \frac{d\mu}{d\lambda} \cdot \frac{d\lambda}{d\lambda_0}$$

and we may write $g_\mu = d\mu/d\lambda$ and $t = d\lambda/d\lambda_0$. The only assertion that is not
immediately obvious is the one concerning the vanishing of $t$. To prove it,
suppose that $\mu(E) = 0$ for every $\mu$ in $\mathfrak{M}$; the fact that then

$$0 = \lambda(E) = \int_E t(x)\, d\lambda_0(x)$$

implies the desired conclusion.

If, conversely, $f_\mu = g_\mu \cdot t$, then we may write $d\lambda = t\,d\lambda_0$. The relation $\mathfrak{M} \equiv \lambda$
follows from the statement concerning the vanishing of $t$, and the relation
$d\mu/d\lambda\ (\epsilon)\ T^{-1}(\mathbf{T})$ is implied by the equation $d\mu = g_\mu \cdot t\,d\lambda_0 = g_\mu\,d\lambda$.

For the statement of the next consequence of Theorem 1 it is convenient to
call a set $\mathfrak{M}$ of measures on $\mathbf{S}$ homogeneous if $\mu \equiv \nu$ for every $\mu$ and $\nu$ in $\mathfrak{M}$.

Corollary 2. A necessary and sufficient condition that the statistic $T$ be
sufficient for a homogeneous set $\mathfrak{M}$ of measures on $\mathbf{S}$ is that, for every $\mu$ and $\nu$ in $\mathfrak{M}$,
$d\nu/d\mu\ (\epsilon)\ T^{-1}(\mathbf{T})$.

Proof. Since a homogeneous set is dominated (by any one of its elements),
Theorem 1 is applicable. If $T$ is sufficient for $\mathfrak{M}$ and if $\lambda$ has the properties
described in Theorem 1, then $d\nu/d\mu = (d\nu/d\lambda)/(d\mu/d\lambda)$. The converse follows,
through Theorem 1, by letting $\lambda$ be any measure in $\mathfrak{M}$.

We shall say that the statistic $T$ is pairwise sufficient for a set $\mathfrak{M}$ of measures
on $\mathbf{S}$ if it is sufficient for every pair $\{\mu, \nu\}$ of measures in $\mathfrak{M}$. In other words,
$T$ is pairwise sufficient for $\mathfrak{M}$ if, for every $E$ in $\mathbf{S}$ and $\mu$ and $\nu$ in $\mathfrak{M}$, there exists a
measurable function $p_{\mu\nu}(E \mid y)$ on $Y$ such that

$$p_\mu(E \mid y) = p_{\mu\nu}(E \mid y)\ [\mu T^{-1}] \quad \text{and} \quad p_\nu(E \mid y) = p_{\mu\nu}(E \mid y)\ [\nu T^{-1}].$$

¹⁶ See J. Neyman, "Su un teorema concernente le cosiddette statistiche sufficienti," Ist.
Ital. Att. Giorn., Vol. 6 (1935), pp. 320-334. In this paper Neyman is somewhat restricted
by his use of classical analytical methods, but he points out the possibility and desirability
of extending his results to a much more general domain. For a recent presentation of the
theory and further references to the literature cf. H. Cramér, Mathematical Methods of
Statistics, Princeton, 1946.

Since pairwise sufficiency is (at least apparently) weaker than sufficiency, it is
not surprising that there is a simple criterion for it even in the case of quite
arbitrary (not necessarily homogeneous or dominated) sets of measures.

Corollary 3. A necessary and sufficient condition that $T$ be pairwise sufficient
for a set $\mathfrak{M}$ of measures on $\mathbf{S}$ is that, for any two measures $\mu$ and $\nu$ in $\mathfrak{M}$,
$d\mu/d(\mu + \nu)\ (\epsilon)\ T^{-1}(\mathbf{T})$.

Proof. If $T$ is sufficient for $\mu$ and $\nu$, then there exists a measure $\lambda \equiv \mu + \nu$
such that $d\mu/d\lambda\ (\epsilon)\ T^{-1}(\mathbf{T})$ and $d\nu/d\lambda\ (\epsilon)\ T^{-1}(\mathbf{T})$. It follows that

$$\frac{d\mu}{d(\mu + \nu)} = \frac{d\mu}{d\lambda} \Big/ \frac{d(\mu + \nu)}{d\lambda}
 = \frac{d\mu}{d\lambda} \Big/ \left( \frac{d\mu}{d\lambda} + \frac{d\nu}{d\lambda} \right)\ (\epsilon)\ T^{-1}(\mathbf{T}).$$

The sufficiency of the condition follows immediately by applying Theorem 1
to the two-element set $\{\mu, \nu\}$.
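On a finite space the criterion of Corollary 3 says that the quotient $\mu(\{x\})/(\mu(\{x\}) + \nu(\{x\}))$ must be the same for all points $x$ with the same value of $T(x)$, apart from points of $(\mu + \nu)$-measure zero. The following sketch tests this for two measures given by point masses; the function name and the numerical example are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def is_sufficient_for_pair(mu, nu, T):
    """Check, on a finite space, whether d mu / d(mu + nu) is a function of T(x)."""
    seen = {}
    for x in mu:
        total = mu[x] + nu[x]
        if total == 0:
            continue                      # null set for mu + nu, no condition
        ratio = mu[x] / total
        if seen.setdefault(T(x), ratio) != ratio:
            return False                  # two values on one set T^{-1}({y})
    return True


def bernoulli_pair_measure(theta):
    """Point masses for two independent Bernoulli(theta) trials."""
    return {x: theta ** sum(x) * (1 - theta) ** (2 - sum(x))
            for x in product((0, 1), repeat=2)}


if __name__ == "__main__":
    mu = bernoulli_pair_measure(Fraction(1, 3))
    nu = bernoulli_pair_measure(Fraction(1, 2))
    print(is_sufficient_for_pair(mu, nu, sum))
    print(is_sufficient_for_pair(mu, nu, lambda x: x[0]))
```

The number of successes passes the test, while the first observation alone does not.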

7. Pairwise sufficiency and likelihood ratios. It is sometimes convenient to
express the result of Corollary 3 in slightly different language. If $\lambda$ is a measure
on $\mathbf{S}$ and if $f$ and $g$ are real valued measurable functions on $X$ such that
$\lambda(\{x: f(x) = g(x) = 0\}) = 0$, we shall say that the pair $(f, g)$ is admissible $[\lambda]$.
(Intuitively an admissible pair $(f, g)$ is to be thought of as a ratio $f/g$, which,
however, may not be formed directly at the points $x$ for which $g(x) = 0$.) Two
admissible pairs $(f_1, g_1)$ and $(f_2, g_2)$ will be called equivalent $[\lambda]$, in symbols
$(f_1, g_1) \cong (f_2, g_2)\ [\lambda]$, if there exists a real valued measurable function $t$ on $X$
such that $t(x) \neq 0\ [\lambda]$ and such that $f_1 = t f_2$ and $g_1 = t g_2\ [\lambda]$. It is clear that the
relation "$\cong [\lambda]$" is indeed an equivalence; the equivalence class containing the
admissible pair $(f, g)$ will be called the ratio of $f$ and $g$ and will be denoted by
$f \mid g$. (A ratio may accordingly be described as a measurable function from $X$
to the real projective line.) For a ratio $f \mid g$ we shall write $f \mid g\ (\epsilon)\ T^{-1}(\mathbf{T})\ [\lambda]$
if the equivalence class $f \mid g$ contains a pair $(f_0, g_0)$ which is admissible $[\lambda]$ and
for which $f_0\ (\epsilon)\ T^{-1}(\mathbf{T})$ and $g_0\ (\epsilon)\ T^{-1}(\mathbf{T})$.

Lemma 8, If y, v, Xi, and Xj are measures on S such that y + r « Xi and 
y + V «hi, then the pairs (dy/dki, dv/dhf) and (dy/d\i, dv/dXf) are admissible 
[m + r] and equivalent I/x + H* 

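Proof. The admissibility of, for instance, $(d\mu/d\lambda_1, d\nu/d\lambda_1)$ follows from the 
fact that $d\mu/d\lambda_1 \neq 0\ [\mu]$ and $d\nu/d\lambda_1 \neq 0\ [\nu]$, whence 

$$(\mu + \nu)\left(\left\{x: \frac{d\mu}{d\lambda_1}(x) = \frac{d\nu}{d\lambda_1}(x) = 0\right\}\right) = 0.$$

To prove equivalence, we write $\lambda_1 + \lambda_2 = \lambda$. Since 

$$\frac{d\mu}{d\lambda_1}\frac{d\lambda_1}{d\lambda} = \frac{d\mu}{d\lambda} = \frac{d\mu}{d\lambda_2}\frac{d\lambda_2}{d\lambda}, \qquad \frac{d\nu}{d\lambda_1}\frac{d\lambda_1}{d\lambda} = \frac{d\nu}{d\lambda} = \frac{d\nu}{d\lambda_2}\frac{d\lambda_2}{d\lambda},$$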





PAUL H. HALMOS AND L. J. SAVAGE 


since also $d\lambda_1/d\lambda \neq 0\ [\lambda_1]$ and therefore $d\lambda_1/d\lambda \neq 0\ [\mu + \nu]$, and since, similarly, 
$d\lambda_2/d\lambda \neq 0\ [\mu + \nu]$, the conditions of the definition of equivalence are satisfied 
by $t = (d\lambda_2/d\lambda)/(d\lambda_1/d\lambda)$. 

If $\mu$ and $\nu$ are any two measures on $\mathbf{S}$ and if $\lambda$ is any measure on $\mathbf{S}$ such that 
$\mu + \nu \ll \lambda$ (for instance if $\lambda = \mu + \nu$), then the ratio $d\mu/d\lambda \mid d\nu/d\lambda$, which 
according to Lemma 8 exists $[\mu + \nu]$ and is independent of $\lambda$, will be called the 
likelihood ratio of $\mu$ and $\nu$ and will be denoted by $d\mu \mid d\nu$. The result of Corollary 
3 may be expressed in terms of likelihood ratios as follows. 

Theorem 2. A necessary and sufficient condition that T be pairwise sufficient 
for a set $\mathfrak{M}$ of measures on $\mathbf{S}$ is that, for any two measures $\mu$ and $\nu$ in $\mathfrak{M}$, 
$d\mu \mid d\nu\ (\epsilon)\ T^{-1}(\mathbf{T})$. 

Proof. If T is sufficient for $\mu$ and $\nu$, then, by Corollary 3, $d\mu/d(\mu + \nu)\ (\epsilon)\ T^{-1}(\mathbf{T})$, 
$d\nu/d(\mu + \nu)\ (\epsilon)\ T^{-1}(\mathbf{T})$, and, by Lemma 8, $(d\mu/d(\mu + \nu), d\nu/d(\mu + \nu))$ is 
an admissible pair belonging to the equivalence class $d\mu \mid d\nu$. Suppose conversely 
that $f = d\mu/d(\mu + \nu)$, $g = d\nu/d(\mu + \nu)$, and let the real valued measurable 
functions $t$, $f_0$, and $g_0$ be such that $t \neq 0\ [\mu + \nu]$, $f_0\ (\epsilon)\ T^{-1}(\mathbf{T})$, $g_0\ (\epsilon)\ T^{-1}(\mathbf{T})$, 
$(f_0, g_0)$ is admissible $[\mu + \nu]$, and 

$$f = t \cdot f_0, \qquad g = t \cdot g_0 \quad [\mu + \nu].$$

Since $f$ and $g$ are non negative, it follows that $f = |t| \cdot |f_0|$, $g = |t| \cdot |g_0|$ 
$[\mu + \nu]$, i.e. that there is no loss of generality in assuming that $t$, $f_0$, and $g_0$ are 
non negative. The relation $f + g = 1\ [\mu + \nu]$ implies that $t \cdot (f_0 + g_0) = 1$ 
$[\mu + \nu]$; the fact that $(f_0, g_0)$ is admissible $[\mu + \nu]$ then yields $t\ (\epsilon)\ T^{-1}(\mathbf{T})$. The 
proof is completed by comparing this result with the expressions for $f$ and $g$ in 
terms of $f_0$ and $g_0$ and applying Corollary 3. 

8. Pairwise sufficiency versus sufficiency. In order to show that our results 
on pairwise sufficiency (in the preceding section and in the sequel) are not 
vacuous, we proceed now to exhibit a statistic which is, for a suitable set of 
measures, pairwise sufficient but not sufficient. 

Let $X = \{(x, i): 0 \le x \le 1,\ i = 0, 1\}$ be the union of two unit intervals and 
let $Y = \{y: 0 \le y \le 1\}$ be a unit interval. In accordance with our basic 
convention, measurability in both $X$ and $Y$ is to be taken in the sense of Borel. 
The statistic T is defined by $T(x, i) = x$. 

Write $X_0 = \{(x, 0): 0 \le x \le 1\}$ and $X_1 = \{(x, 1): 0 \le x \le 1\}$. Let $\mu$ be 
(linear) Lebesgue measure on the class $\mathbf{S}$ of Borel subsets of $X$, and define, 
whenever $E \in \mathbf{S}$ and $0 \le \alpha \le 1$, 

$$\mu_\alpha(E) = \tfrac{1}{2}[\mu(E \cap X_0) + \chi_E(\alpha, 1)].$$

Let $\nu$ be (linear) Lebesgue measure on the class $\mathbf{T}$ of Borel subsets of $Y$, and 
define, whenever $F \in \mathbf{T}$ and $0 \le \alpha \le 1$, 

$$\nu_\alpha(F) = \tfrac{1}{2}[\nu(F) + \chi_F(\alpha)].$$

Clearly $\nu_\alpha = \mu_\alpha T^{-1}$; we write $\mathfrak{M} = \{\mu_\alpha : 0 \le \alpha \le 1\}$. 



SUFFICIENT STATISTICS 


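If $\delta(y, \alpha)$ is defined to be 1 or 0 according as $y = \alpha$ or $y \neq \alpha$, if $\delta'(y, \alpha) = 
1 - \delta(y, \alpha)$, and if 

$$p_\alpha(E \mid y) = \delta'(y, \alpha)\chi_E(y, 0) + \delta(y, \alpha)\chi_E(y, 1),$$

then a straightforward computation shows that 

$$\mu_\alpha(E \cap T^{-1}(F)) = \int_F p_\alpha(E \mid y)\, d\nu_\alpha(y),$$

so that $p_\alpha(E \mid y) = p_{\mu_\alpha}(E \mid y)\ [\nu_\alpha]$. 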

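It is now easy to verify that T is pairwise sufficient for $\mathfrak{M}$. Indeed if $\alpha$ and $\beta$ 
are any two different numbers in the closed unit interval, we may write 

$$p(E \mid y) = \delta'(y, \alpha)\delta'(y, \beta)\chi_E(y, 0) + [\delta(y, \alpha) + \delta(y, \beta)]\chi_E(y, 1).$$

Since $\{y: p(E \mid y) \neq p_\alpha(E \mid y)\} \subset \{\beta\}$ and $\{y: p(E \mid y) \neq p_\beta(E \mid y)\} \subset \{\alpha\}$, it 
follows that $p(E \mid y) = p_{\mu_\alpha}(E \mid y)\ [\nu_\alpha]$ and $p(E \mid y) = p_{\mu_\beta}(E \mid y)\ [\nu_\beta]$. 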

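To prove that T is not sufficient for $\mathfrak{M}$ we observe that $p_\alpha(X_1 \mid y) = 
\delta(y, \alpha)\chi_{X_1}(y, 1) = \delta(y, \alpha)$ and therefore 

$$p_{\mu_\alpha}(X_1 \mid y) = \delta(y, \alpha)\ [\nu_\alpha].$$

Suppose that there is a conditional probability function $p$ such that $p(E \mid y) = 
p_{\mu_\alpha}(E \mid y)\ [\nu_\alpha]$. Then, in particular, 

$$p(X_1 \mid y) = \delta(y, \alpha)\ [\nu_\alpha].$$

Since $\nu_\alpha(\{\alpha\}) = \tfrac{1}{2} > 0$, it follows that 

$$p(X_1 \mid \alpha) = \delta(\alpha, \alpha) = 1,$$

or, changing to a more suggestive notation, that $p(X_1 \mid y) = 1$ for all $y$. We 
have, however, 

$$\nu_\alpha(\{y: p_{\mu_\alpha}(X_1 \mid y) = 0\}) = \nu_\alpha(\{y: \delta(y, \alpha) = 0\}) = \nu_\alpha(\{y: y \neq \alpha\}) = \tfrac{1}{2},$$

so that $\nu_\alpha(\{y: p(X_1 \mid y) = 0\}) = \tfrac{1}{2}$. This contradiction shows the impossibility 
of the existence of a conditional probability function common to every $\mu$ in $\mathfrak{M}$. 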

This example shows also that, in a sense, sufficiency is more fundamental 
than pairwise sufficiency. If, for instance, we imagine that it is important to a 
statistician that he either estimate $\alpha$ sharply or refrain from estimating it 
altogether, then he is by no means as well off with the observation of $y$ as with 
that of $x$. 

9. Pairwise sufficiency for dominated sets. We now proceed to show that 
for dominated sets of measures no such example as the one in the preceding 
section exists, or, in other words, that for dominated sets the concepts of pairwise 
sufficiency and sufficiency do coincide. 





Lemma 9. If T is pairwise sufficient for a set $\{\mu_0, \mu_1, \mu_2\}$ of three measures 
on $\mathbf{S}$, then 

$$\frac{d\mu_0}{d(\mu_0 + \mu_1 + \mu_2)}\ (\epsilon)\ T^{-1}(\mathbf{T}).$$


Proof. According to Corollary 3, 

$$f_i = \frac{d\mu_0}{d(\mu_0 + \mu_i)}\ (\epsilon)\ T^{-1}(\mathbf{T}), \qquad i = 1, 2.$$

Since $d\mu_0 = f_1 d(\mu_0 + \mu_1) = f_2 d(\mu_0 + \mu_2)$, we have $f_1 d\mu_0 = f_1 f_2 d(\mu_0 + \mu_2)$ and 
$f_2 d\mu_0 = f_1 f_2 d(\mu_0 + \mu_1)$, so that 

$$(f_1 + f_2 - f_1 f_2)\, d\mu_0 = f_1 f_2\, d(\mu_0 + \mu_1 + \mu_2).$$

If we write $d\mu_0 = f\, d(\mu_0 + \mu_1 + \mu_2)$, then it follows that 

$$(f_1 + f_2 - f_1 f_2) f = f_1 f_2 \quad [\mu_0 + \mu_1 + \mu_2].$$

Since $0 \le f_1 \le 1$ and $0 \le f_2 \le 1$, the equation $f_1 + f_2 - f_1 f_2 = 0$ is equivalent 
to $f_1 = f_2 = 0$. Since $\mu_0(\{x: f_1(x) = f_2(x) = 0\}) = 0$, it follows that $f$ may be 
redefined, if necessary, to be 0 on the set $\{x: f_1(x) = f_2(x) = 0\}$ without affecting 
the relation $d\mu_0 = f\, d(\mu_0 + \mu_1 + \mu_2)$; since outside this set $f = f_1 f_2/(f_1 + f_2 - f_1 f_2)$, 
the proof of the lemma is complete. 

Lemma 10. If T is pairwise sufficient for a finite set $\{\mu_0, \mu_1, \cdots, \mu_k\}$ of 
measures on $\mathbf{S}$, then $d\mu_0/d(\sum_{i=0}^{k} \mu_i)\ (\epsilon)\ T^{-1}(\mathbf{T})$. 

Proof. For $k = 1$ the conclusion is a restatement of the hypothesis; we 
proceed by induction. Given $\mu_0, \mu_1, \cdots, \mu_{k+1}$, we write $\mu = \sum_{i=1}^{k} \mu_i$. Then 
$d\mu_0/d(\mu_0 + \mu)\ (\epsilon)\ T^{-1}(\mathbf{T})$ by the induction hypothesis and $d\mu_0/d(\mu_0 + \mu_{k+1})\ (\epsilon)\ 
T^{-1}(\mathbf{T})$ by Corollary 3. Lemma 9 may then be applied to $\{\mu_0, \mu, \mu_{k+1}\}$ and 
yields the desired conclusion. 

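Lemma 11. If $\{\mu_0, \mu_1, \mu_2, \cdots\}$ is a sequence of measures on $\mathbf{S}$ such that 
$\sum_{i=0}^{\infty} \mu_i(X) < \infty$; if, for every $E$ in $\mathbf{S}$, $\mu(E) = \sum_{i=0}^{\infty} \mu_i(E)$; and if $\lambda$ is a measure 
on $\mathbf{S}$ such that $\mu_i \ll \lambda$ for $i = 0, 1, 2, \cdots$, then 

$$\lim_k \frac{d(\sum_{i=0}^{k} \mu_i)}{d\lambda} = \frac{d\mu}{d\lambda} \quad [\lambda].$$

Proof. Since $0 \le d(\sum_{i=0}^{k} \mu_i)/d\lambda = \sum_{i=0}^{k} (d\mu_i/d\lambda) \le d\mu/d\lambda\ [\lambda]$, the series 
$\sum_{i=0}^{\infty} (d\mu_i/d\lambda)$ does indeed converge to a measurable function $f\ [\lambda]$. Since, 
for every $E$ in $\mathbf{S}$, 

$$\int_E f\, d\lambda = \sum_{i=0}^{\infty} \int_E \frac{d\mu_i}{d\lambda}\, d\lambda = \sum_{i=0}^{\infty} \mu_i(E) = \mu(E),$$

we have $f = d\mu/d\lambda\ [\lambda]$, as stated. 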


In view of Theorem 1, Lemma 9 asserts that if T is pairwise sufficient for a set $\mathfrak{M}$ of 
three elements, then T is sufficient for $\mathfrak{M}$. Lemmas 10 and 12 extend this result to finite 
and countably infinite sets $\mathfrak{M}$ respectively. Since every countable set of measures is 
dominated, the final result, Theorem 3, contains all these preliminaries as special cases. 





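Lemma 12. If $\{\mu_0, \mu_1, \mu_2, \cdots\}$ is a sequence of measures on $\mathbf{S}$ such that 
$\sum_{i=0}^{\infty} \mu_i(X) < \infty$, and if, for every $E$ in $\mathbf{S}$, $\mu(E) = \sum_{i=0}^{\infty} \mu_i(E)$, then 

$$\lim_k \frac{d\mu_0}{d(\sum_{i=0}^{k} \mu_i)} = \frac{d\mu_0}{d\mu} \quad [\mu].$$

If, in addition, T is pairwise sufficient for the sequence $\{\mu_0, \mu_1, \mu_2, \cdots\}$, then 
$d\mu_0/d\mu\ (\epsilon)\ T^{-1}(\mathbf{T})$. 

Proof. We have, for $k = 0, 1, 2, \cdots$, 

$$\frac{d\mu_0}{d(\sum_{i=0}^{k} \mu_i)} \cdot \frac{d(\sum_{i=0}^{k} \mu_i)}{d\mu} = \frac{d\mu_0}{d\mu} \quad [\mu].$$

If we write $\lambda = \mu$, then the hypotheses of Lemma 11 are satisfied and, 
consequently, the second factor on the left side converges to 1 $[\mu]$; it follows that the 
first factor converges to $d\mu_0/d\mu\ [\mu]$. The second assertion of the lemma follows 
from Lemma 10. 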

Theorem 3. A necessary and sufficient condition that T be sufficient for a 
dominated set $\mathfrak{M}$ of measures on $\mathbf{S}$ is that T be pairwise sufficient for $\mathfrak{M}$. 

Proof. The necessity of the condition is obvious. To prove its sufficiency, 
let $\{\mu_1, \mu_2, \cdots\}$ be a countable subset of $\mathfrak{M}$ which is equivalent to $\mathfrak{M}$ 
(Lemma 7), and let $\mu_0$ be an arbitrary measure in $\mathfrak{M}$. Since the sufficiency or 
pairwise sufficiency of T remains unaltered if some or all of the measures in $\mathfrak{M}$ 
are replaced by positive constant multiples of themselves, we may assume that 
$\sum_{i=0}^{\infty} \mu_i(X) < \infty$. If we write, for every $E$ in $\mathbf{S}$, $\lambda(E) = \sum_{i=1}^{\infty} \mu_i(E)$, then the 
pairwise sufficiency of T and Lemma 12 imply that $d\mu_0/d(\mu_0 + \lambda)\ (\epsilon)\ T^{-1}(\mathbf{T})$. 
The relation 

$$\frac{d\mu_0}{d\lambda} = \frac{d\mu_0}{d(\mu_0 + \lambda)} \cdot \frac{d(\mu_0 + \lambda)}{d\lambda} = \frac{d\mu_0}{d(\mu_0 + \lambda)} \left(\frac{d\lambda}{d(\mu_0 + \lambda)}\right)^{-1} = \frac{d\mu_0}{d(\mu_0 + \lambda)} \left(1 - \frac{d\mu_0}{d(\mu_0 + \lambda)}\right)^{-1}$$

implies that $d\mu_0/d\lambda\ (\epsilon)\ T^{-1}(\mathbf{T})$; an application of Theorem 1 concludes the proof. 

A comparison of Theorems 1 and 2 and Corollary 3 yields immediately the 
following consequence of Theorem 3. 

Corollary 4. A necessary and sufficient condition that the statistic T be 
sufficient for a dominated set $\mathfrak{M}$ of measures on $\mathbf{S}$ is that, for any two measures 
$\mu$ and $\nu$ in $\mathfrak{M}$, $d\mu/d(\mu + \nu)\ (\epsilon)\ T^{-1}(\mathbf{T})$, or, equivalently, $d\mu \mid d\nu\ (\epsilon)\ T^{-1}(\mathbf{T})$. 
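The criterion of Corollary 4 can be checked by direct enumeration in a small discrete case. The sketch below is an illustration only, not part of the paper: it assumes a sample space of 0/1 sequences of length n with two coin measures, and the names `density_wrt_sum` and `depends_only_on` are hypothetical. In this finite setting the condition that the density belongs to $T^{-1}(\mathbf{T})$ reduces to the density being a function of T(x).

```python
from fractions import Fraction
from itertools import product


def density_wrt_sum(x, p, r):
    """Value at x of d(mu)/d(mu + nu) for coin measures with heads probabilities p and r."""
    heads = sum(x)
    tails = len(x) - heads
    mu_x = p ** heads * (1 - p) ** tails
    nu_x = r ** heads * (1 - r) ** tails
    return mu_x / (mu_x + nu_x)


def depends_only_on(statistic, n, p, r):
    """True if the density takes a single value on each level set of the statistic."""
    seen = {}
    for x in product((0, 1), repeat=n):
        value = density_wrt_sum(x, p, r)
        if seen.setdefault(statistic(x), value) != value:
            return False
    return True


half, quarter = Fraction(1, 2), Fraction(1, 4)
print(depends_only_on(sum, 5, half, quarter))               # number of heads
print(depends_only_on(lambda x: x[0], 5, half, quarter))    # first spin only
```

Under these assumptions the number of heads passes the check and the first spin alone does not, which is the pairwise criterion at work for a two-element set of measures.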

10. The value of sufficient statistics in statistical methodology. We gather 
from conversations with some able and prominent mathematical statisticians 
that there is doubt and disagreement about just what a sufficient statistic is 
sufficient to do, and in particular about in what sense if any it contains “all the 
information in a sample.” We therefore conclude this paper with a brief 
explanation of a point of view which, while not original with us, has not received 
due publicity. 





Suppose a statistician $\mathcal{S}$ is to be shown an observation $x$ drawn at random 
from some sample space $(X, \mathbf{S})$ on which an unknown measure $\mu$ of a set $\mathfrak{M}$ of 
possible measures obtains, while for the same observation $x$ another statistician 
$\mathcal{T}$ is only to be shown the value $T(x)$ of some statistic T sufficient for $\mathfrak{M}$. It is 
clear that $\mathcal{S}$ is as well off as $\mathcal{T}$; we shall argue that $\mathcal{T}$ is also as well off as $\mathcal{S}$. 

Suppose $\mathcal{S}$ has decided how to use his datum, that, in other words, he has 
decided just what he will do (or, in particular, say) in the event of each possible $x$. 
His program can then be described schematically by saying that he has selected 
some function $f$ (of the points $x$) which, without serious loss of generality, may 
be supposed to take real values. Now $\mathcal{S}$'s only real concern is for the probability 
distribution of $f$ given $\mu$, i.e. for the function $\varphi$ of a real variable $c$, defined by 

$$\varphi(c) = \mu(\{x: f(x) < c\}) = \mu(E(c)).$$

But $\mathcal{T}$ can if he wishes achieve exactly the same results as $\mathcal{S}$, in the following way. 
Let him, on learning the value of $T(x)$, select a real number $f$, with the aid of a 
"random machine" which produces numerical values according to the known 
distribution function whose value at $c$ is 

$$p(E(c) \mid T(x)).$$

Then, for any $\mu$ in $\mathfrak{M}$, the probability that $\mathcal{T}$ will select a value less than $c$ is 

$$\int p(E(c) \mid y)\, d\mu T^{-1}(y) = \mu(E(c)) = \varphi(c).$$

Thus $\mathcal{T}$ is at no disadvantage, save for the mechanical one of having to manipulate 
a random machine, and he may fairly be said to have as much information 
as $\mathcal{S}$. 
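The argument can be traced by exact enumeration in the coin-spinning setting mentioned in the next paragraph. The following is a minimal sketch under stated assumptions, not the authors' construction: the sample space is the set of 0/1 sequences of length n, T is the number of heads, and `longest_run_of_heads` is a hypothetical program f chosen only because it is not a function of T. The random machine draws a sequence uniformly among those with the observed number of heads, which is the conditional distribution given T for every heads probability.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def sequence_prob(x, p):
    """Probability of the 0/1 sequence x when each spin shows heads (1) with probability p."""
    heads = sum(x)
    return p ** heads * (1 - p) ** (len(x) - heads)


def direct_distribution(f, n, p):
    """Distribution of f(x) for the statistician who is shown the whole sequence x."""
    dist = defaultdict(Fraction)
    for x in product((0, 1), repeat=n):
        dist[f(x)] += sequence_prob(x, p)
    return dict(dist)


def randomized_distribution(f, n, p):
    """Distribution of the value selected by the statistician who is shown only T(x).

    Given T(x) = t, the random machine draws a sequence uniformly from those
    with t heads (a conditional distribution that does not involve p) and
    reports f of the drawn sequence.
    """
    by_heads = defaultdict(list)
    for x in product((0, 1), repeat=n):
        by_heads[sum(x)].append(x)
    dist = defaultdict(Fraction)
    for seqs in by_heads.values():
        prob_t = sum(sequence_prob(x, p) for x in seqs)  # probability that T = t
        for x in seqs:
            dist[f(x)] += prob_t * Fraction(1, len(seqs))
    return dict(dist)


def longest_run_of_heads(x):
    """Hypothetical program f; it is not a function of the number of heads alone."""
    best = run = 0
    for bit in x:
        run = run + 1 if bit else 0
        best = max(best, run)
    return best


for p in (Fraction(1, 2), Fraction(1, 4)):
    same = direct_distribution(longest_run_of_heads, 5, p) == randomized_distribution(longest_run_of_heads, 5, p)
    print(p, same)
```

Exact rational arithmetic is used so that the two distributions can be compared for equality rather than approximate agreement.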

As a matter of fact we know of no practical situation in which $\mathcal{T}$ would actually 
go to the trouble of using a random machine. There are some situations in 
which he should in principle do so, but in which practical statisticians have not, 
so far as we know, thought it worth while. If, for example, an outcome consists 
of a sequence of n heads and tails resulting from n spins of a coin the heads 
ratio of which is known to be either one half or one quarter, then a sufficient 
statistic is the number of heads which occur in the sequence. In basing a 
decision on the outcome of this program both $\mathcal{S}$ and, to a still greater extent, 
$\mathcal{T}$ have (according to Wald's theory of minimum risk) something to gain by 
recourse to a random machine. There are, on the other hand, many technical 
desiderata which sufficient statistics meet exactly without recourse to random 
machines. Thus, as Blackwell has shown,† if $\mathcal{S}$ has an unbiased estimate, $R$, 
of some parameter, $\mathcal{T}$ can find a function $R^*$, defined by $R^*(y) = e(R \mid y)$, 
which is an unbiased estimate of that parameter, with variance not greater than 
that of $R$. More generally, if $R$ is any estimate with finite mean square deviation 
from a parameter, then it is easy to show with Blackwell's methods that $R^*$ 
has no larger a mean square deviation than $R$. Finally it is a well known fact 
that, under suitable hypotheses, if there exists a maximum likelihood estimate $R$ 
of some parameter, then $R$ depends only on $y$. 

† D. Blackwell, "Conditional expectation and unbiased sequential estimation," Annals 
of Math. Stat., Vol. 18 (1947), pp. 105-110. 
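Blackwell's improvement can be seen numerically in the same coin-spinning setting. The sketch below is an illustration under assumed choices, not taken from the paper: the crude estimate R is the outcome of the first spin, T is the number of heads, and the helper names `moments` and `conditional_expectation` are hypothetical. Averaging R over the sequences that share a value of T gives R*, which here is the proportion of heads.

```python
from fractions import Fraction
from itertools import product


def moments(estimator, n, p):
    """Exact mean and variance of estimator(x) over all 0/1 sequences of length n."""
    mean = second = Fraction(0)
    for x in product((0, 1), repeat=n):
        heads = sum(x)
        prob = p ** heads * (1 - p) ** (n - heads)
        value = Fraction(estimator(x))
        mean += prob * value
        second += prob * value * value
    return mean, second - mean * mean


def first_spin(x):
    """Crude unbiased estimate R of the heads probability: the first spin alone."""
    return x[0]


def conditional_expectation(estimator, n):
    """Return R*(x), the average of the estimator over sequences with as many heads as x."""
    groups = {}
    for x in product((0, 1), repeat=n):
        groups.setdefault(sum(x), []).append(Fraction(estimator(x)))
    table = {t: sum(values) / len(values) for t, values in groups.items()}
    return lambda x: table[sum(x)]


p = Fraction(1, 4)
improved = conditional_expectation(first_spin, 4)
print(moments(first_spin, 4, p))   # mean and variance of R
print(moments(improved, 4, p))     # mean and variance of R*
```

Both estimates have mean p, while the variance drops from p(1 - p) to p(1 - p)/n, in line with the statement that the variance of R* is not greater than that of R.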

We think that confusion has from time to time been thrown on the subject 
by (a) the unfortunate use of the term "sufficient estimate," (b) the undue 
emphasis on the factorability of sufficient statistics, and (c) the assumption 
that a sufficient statistic contains all the information in only the technical 
sense of "information" as measured by variance. 



ON DESIGNING SINGLE SAMPLING INSPECTION PLANS 
By Frank E. Grubbs 

Ballistic Research Laboratories, Aberdeen Proving Ground, Md. 

1. Summary. In designing single sampling inspection plans, a problem is to 
find the acceptance number, c, and the smallest sample size, n, such that if the 
fraction defective of the material inspected is equal to an acceptable value, $p_1$, a 
large percentage, say 95%, of such lots will be accepted under the sample criteria, 
whereas if the fraction defective of the material inspected is objectionable and 
equal to $p_2$ (where $p_1 < p_2$), then a large percentage, say 90%, of such lots will 
be rejected. A solution to this problem for the case where the lot size is large 
compared to the sample size is given in this paper and tables are provided for 
quick determination of the sample size n and acceptance number c. 

2. Introduction. In sampling inspection of material one practice is to set an 
acceptable quality level = $p_1$, say, such that the consumer desires to accept 
practically all (95% or more) of lots of fraction defective $p_1$ or less (and hence 
desires to reject at most a maximum of about 5% of lots which are of quality 
$p_1$ or better) and to set also an objectionable fraction defective = $p_2$, say, which 
represents quality so poor that the consumer cannot afford to accept more than 
about 10% or less of lots of this quality or poorer.* From the standpoint of the 
producer, he should have very few rejections, 5% or less, for his submitted lots 
the fractions defective of which are equal to or better (less) than $p_1$, whereas 
he should be willing and also expect to suffer increasingly more rejections if his 
process average percent defective departs from the acceptable quality level $p_1$ 
toward poor or objectionable quality. In this connection, if we are given $p_1$ an 
acceptable quality level, $p_2$ an objectionable percent defective, the risk $\alpha$ = 5% 
of rejecting a lot of fraction defective $p_1$, and the risk $\beta$ = 10% of accepting a 
lot of the objectionable fraction defective $p_2$, a problem of importance in single 
sampling inspection is to find the smallest sample size n and the acceptance 
number c which will approximate closely the protection stated above. Due to 
the discrete nature of n and c, it is not usually possible to find n and c such that 
precisely the above protection is guaranteed; however, it is possible to pick that 
single sampling plan which, for all practical purposes, gives the desired protection, 
i.e. it is possible to select that single sampling plan which more nearly satisfies 
the above protection requirements than any other plan. The values of n and c 
can be found simply by looking for an entry in Table I below which is close to 
$p_1$ and an entry in Table II close to $p_2$ such that column heading c and row 
heading n in Table I correspond exactly with the respective column and row 
headings in Table II. For the sample sizes n, acceptance numbers c and quality 
levels p covered in Tables I and II, the above procedure makes unnecessary any 
computation of or any approximation to the sample size and acceptance number. 
It will be noticed, however, that usually the proper choice of c is clear whereas 
some slight judgment may be necessary in selecting n. 

* When this paper was first presented for publication, the percent defectives $p_1$ and $p_2$ 
were labeled "Acceptable Quality Level" and "Lot Tolerance Percent Defective," 
respectively. In view of the suggestions of H. G. Romig and H. F. Dodge, strict reference to 
these particular terms has been avoided in order that the percent defectives $p_1$ and $p_2$ 
would appear in a more generalized form. This recommendation is considered especially 
desirable in view of the fact that Table I and Table II of the paper are percentage points 
of the Binomial Distribution and hence are useful in problems other than that of designing 
single sampling inspection plans. 

It is remarked also that Tables I and II solve the equivalent problem of 
finding n and c in connection with testing the hypothesis $H_0$ that the fraction 
defective of the Binomial population sampled is $p_1$ or less as against an alternative 
hypothesis $H_1$ which states that the fraction defective of the lot, population, 
process, etc., sampled is $p_2$ or greater ($p_2 > p_1$), where $\alpha$ = .05 is the maximum 
risk of erroneously rejecting $H_0$ when it is true and $\beta$ = .10 is the maximum risk 
of erroneously accepting $H_0$ when the alternative $H_1$ is true. 

The solution to the problem of finding an appropriate single sampling plan 
in this paper is given by solving the infinite case, i.e. by assuming the lot to be an 
infinite Binomial population. In practice lots are of finite size. However, 
it is well known that Binomial probabilities (infinite universe) give excellent 
practical approximations to Hypergeometric probabilities (finite lot) provided 
the sample size is only a small percentage of the lot size. Hence, the reader is 
warned in using the tables for sampling inspection problems that the lot size 
should be at least 10 or 16 times the sample size. 
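The quality of the Binomial approximation to the Hypergeometric can be examined directly. The sketch below is only an illustration under assumed values: it takes the plan n = 37, c = 1 from the example later in the paper, a fraction defective of .10, and two hypothetical lot sizes equal to 10 and 100 times the sample size; the function names are not from the paper.

```python
from math import comb


def binomial_accept(c, n, p):
    """Probability of c or fewer defectives in a sample of n from an infinite lot."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(c + 1))


def hypergeometric_accept(c, n, lot_size, defectives):
    """Probability of c or fewer defectives when sampling n without replacement from a finite lot."""
    favorable = sum(
        comb(defectives, k) * comb(lot_size - defectives, n - k)
        for k in range(c + 1)
    )
    return favorable / comb(lot_size, n)


n, c = 37, 1
infinite_lot = binomial_accept(c, n, 0.10)
for multiple in (10, 100):
    lot_size = multiple * n
    finite_lot = hypergeometric_accept(c, n, lot_size, lot_size // 10)  # 10% defective
    print(multiple, round(finite_lot, 4), round(infinite_lot, 4))
```

The gap between the two probabilities shrinks as the lot grows relative to the sample, which is the reason for the warning about lot size.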

3. Basis for construction of Table I and Table II. It is well known that if 
P(c, n, p) represents the probability of obtaining c or less defectives in a random 
sample of size n from a Binomial Population of fraction defective p, then the 
relation between P(c, n, p) and the Incomplete Beta Function Ratio is given by 

$$P(c, n, p) = I_{1-p}(n - c,\ c + 1) = \frac{1}{B(n - c,\ c + 1)} \int_0^{1-p} x^{n-c-1}(1 - x)^{c}\, dx. \tag{1}$$

Consequently, using a table of percentage points for the Incomplete Beta 
Function [1], values of $p_1$ can be found for Table I such that 

$$P(c, n, p_1) = .95,$$

and values of $p_2$ can be found for Table II presented at the end such that 

$$P(c, n, p_2) = .10.$$
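Entries of this kind can also be obtained numerically instead of from tables of the Incomplete Beta Function. The following is a minimal sketch, not the author's procedure: it sums the Binomial probabilities directly and solves for p by bisection, and the names `accept_prob` and `percentage_point` are hypothetical. It assumes c < n, since P(c, n, p) = 1 for every p when c is n or more.

```python
from math import comb


def accept_prob(c, n, p):
    """P(c, n, p): probability of c or fewer defectives in a sample of n."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(c + 1))


def percentage_point(c, n, target, tol=1e-12):
    """Solve accept_prob(c, n, p) = target for p by bisection.

    accept_prob decreases from 1 to 0 as p goes from 0 to 1 (for c < n),
    so the root is bracketed by the interval [0, 1].
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if accept_prob(c, n, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(percentage_point(0, 1, 0.95), 4))    # compare with the Table I entry for n = 1, c = 0
print(round(percentage_point(1, 2, 0.95), 3))    # compare with the Table I entry for n = 2, c = 1
print(round(percentage_point(9, 10, 0.95), 3))   # compare with the Table I entry for n = 10, c = 9
```

Calling `percentage_point(c, n, 0.10)` gives the corresponding value of $p_2$ for the same (n, c).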

Also, Table I and Table II can be computed by using percentage points of the 
F-distribution [2]. Upon making the transformation 

$$x = \frac{2(n - c)}{2(n - c) + 2(c + 1)F}$$

in (1) above to the F-distribution, we obtain easily that 

$$P(c, n, p) = \frac{[2(c + 1)]^{c+1}\,[2(n - c)]^{n-c}}{B(c + 1,\ n - c)} \int_{(n-c)p/(c+1)q}^{\infty} \frac{F^{c}\, dF}{[2(n - c) + 2(c + 1)F]^{n+1}}, \tag{2}$$

where $q = 1 - p$. 

With the aid of a table of percentage points of the F-distribution [2], we may 
determine for various combinations of n - c and c + 1 those values of p such that 

$$P(c, n, p_1) = .95 \text{ for Table I;}$$

and 

$$P(c, n, p_2) = .10 \text{ for Table II.}$$

In fact, if P(c, n, p) = a, then 

$$\frac{(n - c)p}{(c + 1)q} = F_a(2(c + 1),\ 2(n - c)),$$

or 

$$p = \frac{(c + 1)F_a(2(c + 1),\ 2(n - c))}{(n - c) + (c + 1)F_a(2(c + 1),\ 2(n - c))},$$

for which relation values of $p_1$ for a = .95 are given in Table I below and values of 
$p_2$ for a = .10 are given in Table II. 
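As a check on this relation, the case c = 0 can be worked in closed form. The lines below are an added illustration, assuming that $F_a(\nu_1, \nu_2)$ denotes the value exceeded with probability a (the convention suggested by the reciprocal relation for the 95% points quoted in the text) and using the fact that a variate F with 2 and 2n degrees of freedom satisfies $\Pr\{F > x\} = (1 + x/n)^{-n}$.

```latex
\begin{align*}
(1 + x/n)^{-n} = a
  &\;\Longrightarrow\; F_a(2, 2n) = n\left(a^{-1/n} - 1\right), \\
p = \frac{F_a(2, 2n)}{n + F_a(2, 2n)}
  &= \frac{a^{-1/n} - 1}{a^{-1/n}} = 1 - a^{1/n}.
\end{align*}
```

This agrees with the direct computation $P(0, n, p) = (1 - p)^n = a$; for n = 1 and a = .95 it gives p = .0500, the first entry of Table I.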

Although the 95% points are not given directly in [2], they are easily 
obtainable from the relation 

$$F_{.95}(\nu_1, \nu_2) = \frac{1}{F_{.05}(\nu_2, \nu_1)}.$$

Interpolation was required for the great majority of the entries in Tables I 
and II. The values given were obtained by harmonic or linear interpolation 
using References [1] and [2] and are believed accurate to within one unit in the 
last place. 

It will be noticed that if the chosen acceptable quality level, $p_1$, is greater 
than the appropriate tabulated value in Table I for the single sampling plan 
(n, c), then the operating characteristic curve will pass below the point ($p_1$, .95). 
That is, the risk of rejection under the sampling plan for lots of fraction defective 
$p_1$ will be somewhat more than 5%. On the other hand, if a selected acceptable 
quality level $p_1$ is less than the appropriate entry in Table I, the risk of rejection 
for a product of fraction defective $p_1$ will be less than 5%. Similar considerations 
apply also to the fractions defective, $p_2$, in Table II. 


4. Single sampling plans based on the Poisson approximation to the binomial. 
Tables I and II are useful for determination of a single sampling plan when the 
desired percent defectives are listed and n does not exceed 150. Table III 
is particularly useful in designing a single sampling plan when we are interested 
in fractions defective not greater than about .10. A somewhat similar procedure 
has already been suggested by Peach and Littauer [3]. If we designate by 
P(c, a) the sum of individual Poisson probabilities, 

$$P(c, a) = \sum_{m=0}^{c} \frac{e^{-a} a^{m}}{m!},$$

then Table III gives values $a_1 = np_1$ of a for which 

$$P(c, a_1) = .95$$

and values $a_2 = np_2$ of a for which 

$$P(c, a_2) = .10.$$

Hence, to find the single sampling plan whose operating characteristic curve 
passes nearly through the points ($p_1$, .95) and ($p_2$, .10) one merely divides 
values of $a_1$ in Table III for various values of c by the acceptable quality level $p_1$ 
and divides values of $a_2$ in Table III by the objectionable percent defective $p_2$. 
Then the acceptance number c is picked for which $a_1/p_1$ most nearly equals $a_2/p_2$ 
and the approximate sample size n may be determined by rounding to an integer 
the average of the two approximately equal numbers $a_1/p_1$ and $a_2/p_2$. 
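The Poisson procedure can be sketched in a few lines. This is an illustration rather than the author's computation: the values $a_1$ and $a_2$ are found by bisection instead of being read from Table III, "most nearly equals" is interpreted (as an assumption) as the smallest absolute difference between $a_1/p_1$ and $a_2/p_2$, and the function names are hypothetical.

```python
from math import exp, factorial


def poisson_cdf(c, a):
    """P(c, a): sum of the Poisson probabilities for m = 0, 1, ..., c."""
    return sum(exp(-a) * a ** m / factorial(m) for m in range(c + 1))


def poisson_point(c, target, tol=1e-10):
    """Solve poisson_cdf(c, a) = target for a by bisection (the sum decreases in a)."""
    lo, hi = 0.0, 100.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if poisson_cdf(c, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def poisson_plan(p1, p2, max_c=9):
    """Pick c where a1/p1 and a2/p2 are closest and n as the rounded average of the two."""
    best = None
    for c in range(max_c + 1):
        n1 = poisson_point(c, 0.95) / p1
        n2 = poisson_point(c, 0.10) / p2
        gap = abs(n1 - n2)
        if best is None or gap < best[0]:
            best = (gap, c, round((n1 + n2) / 2))
    return best[1], best[2]


print(poisson_plan(0.01, 0.10))   # (acceptance number c, sample size n)
```

For $p_1$ = .01 and $p_2$ = .10 the quotients for c = 1 come out near 35.5 and 38.9, so the sketch selects c = 1 and n = 37.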


5. Example on the use of Tables I, II, III. Given an acceptable percent 
defective or quality level of .01 and an objectionable quality level of .10, it is 
desired to find the single sampling plan which will accept 95% of product which is 
of quality $p_1$ = .01 and which will reject 90% (or accept only 10%) of product of 
quality $p_2$ = .10. Looking in Table I for entries $p_1$ which approximately equal 
.01 and in Table II for entries $p_2$ which approximately equal .10 such that the 
c and n of Tables I and II correspond, we see that c must be equal to 1 whereas n 
may take possibly any one of the values 35, 36, 37, 38. In this connection, we 
have to set up some criterion for the choice of n. Although any of several criteria 
may be used, a reasonable criterion appears to involve picking n such that the 
sum of the absolute departures of the Operating Characteristic Curve from the 
risks $\alpha$ = .05 at $p_1$ and $\beta$ = .10 at $p_2$ is a minimum. This may be determined by 
using appropriate tables of Binomial Probabilities or by computing at $p_1$ and $p_2$ 
the chance of obtaining c or less defectives in n for the various possible 
combinations of c and n. If the above criterion were applied to the present example, 
the combination c = 1 and n = 37 would be selected, i.e. the single sampling 
plan would be c = 1, n = 37. For this sampling plan, the probability of passing 
at $p_1$ = .01 is .9471 and the probability of passing at $p_2$ = .10 is .1036. For 
the sake of expediency, another proposal would be merely to select a 
"middle" value of n, especially when the variation in sample size is slight. 
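The criterion of the preceding paragraph is easy to evaluate directly. The sketch below is an illustration with hypothetical function names: it computes the Binomial chance of c or fewer defectives at $p_1$ and $p_2$ for each candidate n and picks the n with the smallest sum of absolute departures from .95 and .10.

```python
from math import comb


def accept_prob(c, n, p):
    """Probability of c or fewer defectives in a sample of n (Binomial case)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(c + 1))


def departure(c, n, p1, p2, alpha=0.05, beta=0.10):
    """Sum of absolute departures of the operating characteristic from 1 - alpha at p1 and beta at p2."""
    return abs(accept_prob(c, n, p1) - (1 - alpha)) + abs(accept_prob(c, n, p2) - beta)


def best_sample_size(c, candidates, p1, p2):
    """Candidate n with the smallest total departure."""
    return min(candidates, key=lambda n: departure(c, n, p1, p2))


n = best_sample_size(1, range(35, 39), 0.01, 0.10)
print(n, round(accept_prob(1, n, 0.01), 4), round(accept_prob(1, n, 0.10), 4))
```

With c = 1 and the candidates 35 to 38 this reproduces the choice n = 37 and the two probabilities of passing quoted in the text.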

If we use Table III for the above example, we can select n and c with the aid 
of the following simple tabulation: 

                          c
     n          0       1       2       3
  a1/p1        ...     35.5    81.8    ...
  a2/p2        ...     38.9    53.2    ...

Since the sample sizes "cross" at c = 1, we would select c = 1 and n = 1/2 
(35.5 + 38.9) = 37.2 or n = 37. 

A use of Table I of some practical importance is in determining at a glance 
those values of p for which the probability of obtaining c or less defectives in a 
sample of n is equal to .95. As a matter of fact, a series of tables similar to 
Table I and Table II for which P(c, n, p) = .99, .95, .90, .10, .05, .01 etc. would 
be of considerable practical use. 

Acknowledgment. The author is indebted to Miss Helen J. Coon for carrying 
out the computations for the tables. 













TABLE I 

Values of p = p1 such that P(c, n, p1) = .95 

                                          c
   n  0        1        2        3        4        5        6        7        8        9
   1  .0500
   2  .0253    .224
   3  .0170    .135     .368
   4  .0127    .0970    .249     .473
   5  .0102    .0764    .189     .343     .549
   6  .00851   .0628    .153     .271     .418     .607
   7  .00730   .0534    .129     .225     .341     .479     .652
   8  .00639   .0464    .111     .193     .289     .400     .529     .688
   9  .00568   .0410    .0978    .169     .251     .345     .450     .571     .717
  10  .00512   .0368    .0873    .150     .222     .304     .393     .493     .606     .741
  11  .00465   ...      .0788    .135     .200     .271     .350     .436     .530     .630
  12  .00427   ...      .0719    .123     .181     .245     .315     .391     .473     .562
  13  .00394   .0281    .0660    .113     .166     .224     .287     .355     .427     .505
  14  .00366   .0260    .0611    .104     .153     .206     .264     .326     .390     .460
  15  .00341   .0242    .0568    .0967    .142     .191     .244     .300     .360     .423
  16  .00320   ...      .0531    .0903    .132     .178     .227     .279     .333     .391
  17  .00301   ...      .0499    .0846    .124     .166     .212     .260     .311     .364
  18  .00285   ...      .0470    .0797    .118     .156     .199     .244     .291     .341
  19  .00270   .0190    .0445    .0763    .110     .147     .188     .230     .274     .320
  20  .00250   .0181    .0422    .0714    .104     .140     .177     .217     .259     .302
  21  .00244   .0172    .0401    .0678    ...      .132     .168     .206     .245     .286
  22  .00233   .0164    .0382    .0646    .0941    .126     .160     .196     .233     .271
  23  .00223   .0157    .0365    .0617    .0898    .120     .152     .186     .222     .258
  24  .00213   .0150    .0350    .0590    .0859    .115     .146     .178     .212     .246
  25  .00205   .0144    .0335    .0566    ...      .110     .139     .170     .202     .236
  26  .00197   .0138    .0322    .0543    ...      .106     .134     .163     .194     .226
  27  .00190   .0133    .0310    .0522    .0759    .101     .129     .157     .186     .217
  28  ...      .0128    .0298    .0503    .0731    ...      .124     .151     .179     .208
  29  .00177   .0124    .0288    .0485    ...      ...      .119     .146     .172     .200
  30  ...      .0120    .0278    .0469    .0681    ...      .115     .140     .167     .193
  31  ...      .0116    .0269    .0463    .0668    ...      .111     .136     .161     .187
  32  ...      .0112    .0260    .0438    .0637    ...      .107     .131     .155     .180
  33  ...      .0109    .0252    .0425    .0617    ...      .104     .127     .150     .175
  34  ...      .0106    .0245    .0412    .0598    ...      .101     .123     .146     .169
  35  .00146   .0102    .0238    .0400    .0580    ...      .0978    .119     .141     .164


TABLE I, Continued 

                                          c
   n  0        1        2        3        4        5        6        7        8        9
  36  ...      ...      ...      ...      ...      .0752    .0950    .116     .137     .159
  37  ...      ...      ...      ...      .0548    .0731    .0923    .112     .133     .155
  38  .00135   ...      ...      ...      .0533    .0711    .0898    ...      .130     .150
  39  ...      ...      ...      ...      ...      .0692    ...      .106     .126     .146
  40  ...      ...      ...      ...      ...      ...      ...      .104     ...      .142
  41  .00125   ...      ...      ...      ...      .0657    .0830    ...      ...      .139
  42  ...      ...      ...      ...      ...      .0641    ...      ...      .117     .135
  43  ...      ...      ...      ...      ...      .0626    ...      ...      .114     .132
  44  .00117   .00814   ...      ...      ...      .0611    .0771    ...      .111     .129
  45  ...      ...      ...      ...      .0448    .0597    ...      ...      ...      .126
  46  ...      ...      ...      ...      ...      .0584    ...      ...      ...      .123
  47  ...      ...      ...      ...      ...      .0571    .0720    ...      .104     .120
  48  ...      ...      ...      ...      ...      .0559    ...      .0857    ...      .118
  49  ...      ...      ...      ...      ...      .0547    ...      ...      ...      .115
  50  ...      ...      ...      ...      ...      .0536    .0676    ...      .0972    .113
  51  ...      ...      ...      ...      ...      .0525    ...      ...      ...      .110
  52  ...      ...      ...      ...      ...      .0515    ...      .0789    ...      .108
  53  ...      ...      ...      ...      ...      ...      ...      ...      ...      .106
  54  ...      ...      ...      ...      ...      .0495    ...      ...      ...      .104
  55  ...      ...      ...      ...      ...      .0486    ...      ...      ...      .102
  56  ...      ...      ...      ...      ...      .0477    ...      ...      .0865    .100
  57  ...      ...      ...      ...      ...      .0468    ...      .0718    .0849    .0984
  58  ...      ...      ...      ...      ...      ...      ...      .0705    .0834    .0966
  59  ...      ...      ...      ...      ...      .0452    .0570    ...      .0820    .0949
  60  ...      .00595   ...      ...      .0334    .0445    .0561    ...      .0806    .0933
  61  ...      ...      ...      ...      ...      ...      ...      ...      .0792    .0917
  62  ...      ...      ...      ...      ...      ...      ...      ...      .0779    .0902
  63  ...      ...      ...      ...      ...      .0423    .0533    ...      .0766    .0887
  64  ...      ...      ...      ...      .0313    ...      ...      .0637    .0764    .0873
  65  ...      ...      ...      ...      ...      .0410    ...      ...      .0742    .0869
  66  ...      ...      ...      ...      ...      ...      ...      ...      .0730    .0840
  67  ...      ...      .0123    ...      ...      .0397    .0501    ...      .0719    .0833
  68  ...      ...      ...      ...      .0294    ...      ...      ...      .0708    .0820
  69  ...      ...      ...      ...      ...      .0385    .0486    ...      .0698    .0808
  70  ...      ...      ...      ...      ...      .0380    .0479    ...      .0687    .0796


TABLE I, Continued 

                                          c
   n  0        1        2        3        4        5        6        7        8        9
  71  ...      .00503   .0116    .0195    .0282    .0374    .0472    .0573    .0678    .0785
  72  ...      .00496   .0115    .0192    .0278    .0369    .0465    .0565    .0668    .0773
  73  ...      ...      .0113    .0189    .0274    .0364    .0459    .0557    .0658    .0762
  74  ...      .00482   .0111    .0187    .0270    .0359    .0452    .0549    .0649    .0752
  75  ...      ...      .0110    .0184    .0266    .0354    .0446    .0542    .0641    .0742
  76  .000675  .00470   .0108    .0182    .0263    .0349    .0440    .0535    .0632    .0732
  77  .000666  ...      .0107    .0179    .0259    .0345    .0434    .0528    .0623    .0722
  78  .000657  ...      .0106    .0177    .0256    .0340    .0429    .0521    .0615    .0712
  79  .000649  ...      .0104    .0175    .0253    .0336    .0423    .0514    .0607    .0703
  80  .000641  ...      .0103    .0173    .0249    .0332    .0418    .0507    .0600    .0694
  81  .000633  .00440   .0102    .0170    .0246    .0328    .0413    .0501    .0592    .0685
  82  .000625  .00435   .0100    .0168    .0243    .0323    .0408    .0495    .0585    .0677
  83  .000618  .00430   .00992   .0166    .0240    .0319    .0403    .0489    .0577    .0668
  84  .000610  .00425   .00980   .0164    .0237    .0316    .0398    .0483    .0570    .0660
  85  .000603  .00420   .00969   .0162    .0235    .0312    .0393    .0477    .0564    .0652
  86  .000590  .00416   .00967   .0160    .0232    .0308    .0388    .0471    .0557    .0646
  87  .000589  .00410   .00946   .0159    .0229    .0305    .0384    .0466    .0550    .0637
  88  .000583  .00405   .00936   .0157    .0227    .0301    .0379    .0460    .0544    .0630
  89  .000576  .00401   .00925   .0155    .0224    .0298    .0376    .0456    .0538    .0622
  90  .000570  .00396   .00916   .0153    .0221    .0294    .0371    .0450    .0532    .0615
  91  .000564  .00392   .00904   .0152    .0219    .0291    .0367    .0445    .0526    .0608
  92  .000557  .00388   .00895   .0150    .0217    .0288    .0363    .0440    .0520    .0602
  93  .000551  .00383   .00886   .0148    .0214    .0285    .0359    .0435    .0514    .0596
  94  .000546  .00379   .00875   .0147    .0212    .0282    .0355    .0431    .0509    .0589
  95  .000540  .00375   .00866   .0145    .0210    .0279    .0351    .0426    .0503    .0582
  96  .000534  .00371   .00857   .0144    .0207    .0276    .0347    .0421    .0498    .0576
  97  .000529  .00368   .00848   .0142    .0205    .0273    .0344    .0417    .0493    .0570
  98  .000523  .00364   .00840   .0141    .0203    .0270    .0340    .0413    .0487    .0564
  99  ...      .00360   .00831   .0139    .0201    .0267    .0337    .0408    .0482    .0558
 100  .000513  .00357   .00823   .0138    .0199    .0266    .0333    .0404    .0478    .0553
 101  .000508  .00353   .00814   .0136    .0197    .0262    .0330    .0400    .0473    .0547
 102  .000503  .00350   .00806   .0136    .0196    .0259    .0327    .0396    .0468    .0542
 103  .000498  .00346   .00799   .0134    .0193    .0257    .0323    .0392    .0463    .0536
 104  .000493  .00343   .00791   .0132    .0191    .0254    .0320    .0389    .0459    .0531
 105  .000488  .00339   .00783   .0131    .0189    .0252    .0317    .0385    .0454    .0526










TABLE I —Continued

n    c = 0    1    2    3    4    5    6    7    8    9    n

106

.000484 

.00336 

.00776 

.0130 

.0188 

0249 

0314 

M 1 S 

.0450 

.0521 

106 

107 

.000479 

.00333 

.00768 

0129 

.0186 

.0247 

.0311 

jn: K 


.0516 

107 

108 

.000475 

00330 

00761 

.0127 

.0184 

.0245 

.0308 

jn: 1 

.0442 

.0511 

108 

109 

.000470 

.00327 

00754 

.0126 

.0182 

.0242 

.0305 

jn' » 

.0438 

.0506 

109 

no 

000466 

.00324 

.00747 

.0125 

.0181 

.0240 

.0302 

ill 

.0433 

.0502 

110 

111 

.000462 

.00321 

.00741 

.0124 

.0179 

.0238 

.0300 


.0430 

.0497 

111 

112 

.000458 

.00318 

.00734 

.0123 

.0178 

.0236 

.0297 

.0360 


.0492 

112 

113 

.000454 

.00315 

00727 

.0122 

.0170 

.0234 

.0294 

.0357 

.0422 

.0488 

113 

114 

.000450 

.00313 

.00721 

.0121 

.0174 

.0232 

.0292 

.0354 

.0418 

.0484 

114 

115 

.000446 

.00310 

.00715 

.0120 

.0173 

.0230 

.0289 

.0351 

.0414 

.0479 

116 

116 

.000442 

.00307 

.00709 

.0119 

.0171 

.0228 

.0287 

.0348 

.0411 

.0476 

116 

117 

.000438 

.00305 

.00702 

.0118 

.0170 

.0226 

.0284 

.0345 

.0407 

.0471 

117 

118 

.000435 

.00302 

.00696 

.0117 

.0168 

.0224 

,0282 

.0342 


.0467 

118 

119 

.000431 

.00299 

.00691 

.0116 

.0167 

.0222 

. 02'^9 

.0339 


.0463 

119 

120 

.000427 

.00297 

.00685 

.0115 

.0166 

.0220 

.0277 

.0336 

.0397 

.0469 

120 

121 

.000424 

.00294 

.00679 

.0114 

.0164 

.0218 

.0275 

.0333 

.0394 

.0455 

121 

122 

.000420 

.00292 

.00674 

0113 

.0163 

.0216 

.0272 

.0330 


.0451 

122 

123 

.000417 

.00290 

.00668 

.0112 

.0162 

.0215 

.0270 

.0328 

.0387 

.0448 

123 

124 

.000414 

.00287 

00663 

.0111 

.0160 

.0213 

.0268 

.0325 

.0384 

.0444 

124 

125 

.000410 

.00285 

.00657 

.0110 

.0159 

.0211 

0266 

.0322 


.0440 

125 

126 

000407 

.00283 

.00652 

.0109 

.0158 

0209 

.0264 

.0320 

.0378 

.0437 

126 

127 

.000404 

.00281 

.00647 

.0108 

.0156 

.0208 

.0202 

.0317 

.0375 

.0433 

127 

128 

000401 

.00278 

.00642 

.0107 

.0155 

0206 

.0259 

.0316 

.0372 

.0430 

128 

129 

.000398 

.00276 

.00637 

.0107 

.0164 

.0204 

.0257 

.0312 


.0427 

129 

130 

.000394 

.00274 

.00632 

.0106 

.0153 

.0203 

.0255 

.0310 


.0423 

130 

131 

000391 

.00272 

.00627 

.0105 

.0152 

.0201 

.0253 

.0308 

.0303 

.0420 

131 

132 

.000389 

.00270 

.00622 

.0104 

.0160 

.0200 

.0252 

.0305 


.0417 

132 

133 

.000386 

.00268 

.00618 

.0103 

.0149 

.0198 

.0260 

.0303 

.0358 

.0414 

133 

134 

.000383 

.00266 

.00613 

.0103 

.0148 

.0197 

.0248 

.0301 


.0410 

134 

135 

.000380 

.00264 

.00608 

.0102 

.0147 

.0195 

.0246 

.0298 


.0407 

136 

136 

.000377 

.00262 

.00604 

.0101 

.0146 

.0194 

.0244 

.0296 


.0404 

136 

137 

.000374 

.00260 

.00599 

.0100 

.0145 

.0192 

.0242 

.0294 

.0347 

.0401 

137 

138 

.000372 

.00258 

.00595 

.00996 

.0144 

.0191 

.0240 

.0292 

.0344 

.0398 

138 

139 

.000369 

.00256 

.00591 

.00989 

.0143 

.0190 

.0239 

.0290 

.0342 

.0395 

139 

140 

.000366 

.00254 

.00587 

.00982 

.0142 

.0188 

.0237 

.0288 


.0393 

140 




































TABLE II

Values of p = p_2 such that P(c, n, p_2) = .10

n    c = 0    1    2    3    4    5    6    7    8    9    n

1 

,900 










1 

2 

,684 

.949 









2 

3 

.636 

.804 

.965 








3 

4 

,438 

.680 

.857 

.974 







4 

5 

.369 

.684 

753 

.888 

.979 






5 

6 

.319 

.510 

667 

.799 

.907 

.983 





6 

7 

280 

.453 

.596 

.721 

.830 

.921 

.985 




7 

8 

.260 

.406 

.638 

.655 

.760 

.853 

.931 

987 



8 

9 

226 

.368 

.490 

.599 

.699 

.790 

.871 

.939 

.988 


9 

10 

.206 

337 

.450 

.552 

.646 

.733 

.812 

.884 

.945 

990 

10 

11 

189 

310 

.415 

.511 

.599 

.682 

.759 

.831 

.895 

.951 

11 

12 

175 

.288 

386 

.475 

.559 

.638 

.712 

.781 

.846 

904 

12 

13 

.162 

268 

.360 

.444 

.523 

.598 

.669 

.736 

.799 

,858 

IS 

14 

.152 

.251 

.337 

.417 

.492 

.563 

631 

.695 

.757 

.815 

14 

15 

.142 

,236 

317 

.393 

.464 

.532 

.596 

.658 

.718 

.774 

15 

16 

134 

.222 

300 

.371 

.439 

.504 

.566 

.626 

.682 

.737 

16 

17 

.127 

.210 

.284 

.352 

.416 

478 

.537 

.594 

650 

.703 

17 

18 

.120 

.199 

.269 

.334 

396 

.455 

.512 

.567 

.620 

.071 

18 

19 

.114 

.190 

257 

.319 

.378 

.434 

.489 

.541 

.592 

.642 

19 

20 

.109 

.181 

.245 

.304 

.361 

415 

.467 

.518 

.567 

.615 

20 

21 

104 

.173 

234 

.291 

.345 

.397 

.448 

.497 

.544 

.590 

21 

22 

.0994 

.166 

.224 

.279 

.331 

381 

.430 

.477 

.523 

568 

22 

23 

.0953 

,169 

.215 

.268 

,318 

.366 

.413 

.459 

.503 

.546 

23 

24 

1 .0915 

.163 

.207 

.258 

.306 

.352 

.398 

.442 

.486 

.526 

24 

25 

; 0880 

.147 

.199 

.248 

.295 

.340 

.383 

.426 

.467 

.508 

25 

26 

I .0847 

.142 

.192 

239 

.284 

.328 

.370 

.411 

.451 

.491 

26 

27| .0817 

137 

.185 

.231 

.275 

.317 

.358 

.397 

.436 

.476 

27 

28i 0789 

1.32 

.179 

.223 

.265 

.306 

.340 

.385 

,422 

.459 

28 

29 

1 .0763 

.128 

.173 

.216 

.257 

.297 

.335 

.372 

.409 

.446 

29 

3( 

j .0739 

.124 

.168 

209 

.249 

.288 

.325 

.361 

397 

.432 

30 

31 

; 0716 

.120 

.163 

,203 

.241 

.279 

.315 

.350 

.385 

.419 

31 

32 

0694 

.116 

.158 

,197 

.234 

271 

.306 

.340 

.374 

.407 

32 

3S 

.0674 

.113 

.153 

.191 

.228 

203 

297 

.331 

.364 

.39C 

33 

34 

0651 

110 

.149 

.186 

.221 

.256 

.289 

322 

.354 

.38E 

34 

31 

0637 

.107 

.145 

.181 

216 

.249 

,282 

.313 

.345 

.37E 

35 







TABLE II —Continued

n    c = 0    1    2    3    4    5    6    7    8    9    n

36 

.0620 

.104 j 

.141 

.176 

.210 i 

.2«i2 i 

,274 

.305 

.336 

.366 

36 

37 

.0003 

.101 1 

.138 ■ 

.172 

.205 

.236 1 

.267 

.298 1 

.327 

.357 

37 

38 

.0588 

.09851 

.1.34 

.167 

.199 

.230 ' 

.261 

,290 

.319 

.348 

38 

39 

. 0573 

.0961 

.131 

.163 

.195 

.22,5 ' 

.251 

.283 

.312 

.340 

39 

40 

.0,559 

.0938 

.128 

.1.59 

.190 

.220 

.2(8 

.277 

.305 

.332 

40 

41 

.0540 

.0916 

.125 

. 1.56 

.180 

.215 

.242 

.270 

.298 

.324 

41 

42 

.0533 

.0895 

.122 

.152 

.181 

.210 

237 

,264 

.291 

.317 

42 

43 

.0521 

.0875 

.119 

.149 

.177 

.205 

.232 

.259 

.285 

.310 

43 

44 

.0510 

.0856 

.116 

.146 

.174 

.201 

.227 

.253 

.279 

304 

44 

45 

.0499 

.0837 

.114 

.142 

.170 

.190 

.222 

.248 

.273 

.297 

45 

46 

.0488 

.0819 

.112 

.140 

.160 

.192 

.218 

.243 

.268 

.291 

46 

47 

.0478 

.0803 

.109 

.137 

.103 

.188 

.213 

.238 

.262 

.285 

47 

48 

.0468 

.0786 

.107 

.134 

.160 

.185 

.209 

.233 

.257 

.280 

48 

49 

.0459 

.0771 

.105 

.131 

.157 

.181 

.205 

.229 

.252 

.274 

49 

50 

.0450 

.0756 

.103 

.130 

.1.54 

.178 

.201 

.224 

.248 

.269 

60 

51 

,0441 

.0741 

.101 

.126 

.151 

.174 

.197 

.220 

.243 

.264 

51 

62 

.0433 

.0728 

.0991 

.124 

.148 

,171 

.104 

.210 

.239 

.259 

52 

53 

.0425 

.0714 

.0973 

.122 

.145 

,168 

.190 

.212 

.235 

.255 

53 

54 

.0417 

.0701 

.0956 

.120 

.143 

.166 

.187 

.208 

.230 

.250 

54 

56 

.0410 

.0689 

.0939 

.117 

.140 

.162 

.184 

.205 

.227 

,246 

55 

56 

.0403 

.0677 

,0923 

.115 

.138 

.159 

.180 

.201 

.223 

.242 

56 

57 

.0396 

.0666 

.0907 

.113 

.136 

.167 

.177 

.198 

.219 

.238 

57 

58 

.0389 

.0664 

.0892 

,112 

,133 

.164 

.176 

.196 

.216 

.234 

58 

5S 

.0383 

.0643 

.0877 

.110 

.131 

.162 

.172 

.191 

.212 

.230 

59 

60 

.0376 

.0633 

.0863 

.108 

.129 

.149 

.169 

.188 

209 

,220 

60 

61 

.0370 

.0623 

.0849 

.106 

.127 

.147 

.166 

.185 

.206 

223 

61 

62 

.0366 

.0613 

.0836 

.106 

.126 

.146 

.164 

.183 

.203 

.219 

62 

63 

.0359 

.0603 

.0823 

.103 

.123 

.142 

.161 

.180 

.200 

.216 

63 

64 

.0353 

.0694 

.0810 

.101 

.121 

.140 

.169 

.177 

.197 

213 

64 

6E 

.0348 

.0686 

.0798 

.0999 

.119 

.138 

.166 

.174 

.194 

.210 

65 

66 

.0343 

.0577 

.0786 

.0984 

.117 

.136 

.164 

.172 

.191 

207 

66 

67 

.0338 

.0668 

.0775 

.0970 

.116 

.134 

.152 

.169 

.188 

.204 

67 

68 

.0333 

.0660 

.0764 

.0966 

.114 

.132 

.150 

.167 

.185 

.201 

68 

69 

.0328 

.0562 

.0753 

.0943 

.113 

.130 

.148 

.165 

.182 

198 

69 

70 

.0324 

.0644 

.0743 

.0930 

• 111 

.128 

.146 

.162 

.179 

.1951 70 





TABLE II —Continued 



















































































TABLE II —Continued







G 








0 

1 

2 

3 

4 

s 

6 

7 

8 



■ ||9 

Rll 

R^ 

RiQ 




B ! s 

.109 

.120 

.131 

106 

■ m 





.0733 

R!^ 

Ri t* 

.108 

.119 


107 

■ s 

R3 





.0842 

|R til 

.107 

.118 

.128 

108 

■ m 

R^ 


Rff:f 



.0834 

Rn! 

.106 

.116 

.127 

109 

11 

Rg 


.0477 

.0597 

UM 

.0827 

R|| 

.105 

.115 


no 

111 


.0346 




: R 

S 1 

.104 

.114 

.125 

111 

112 


Ri : S 

.0468 

.0587 

.0701 

Ril 

R'' i 

.103 

.113 

.124 

112 

113 

■nKw 

R:'! 


.0582 

Rm j 

R:!' 

R|l 1 

.102 

.112 

.123 

113 

114 


R!! 1 

.0460 


R|' 

R jS: 

R'i 

.101 

,111 

.122 

114 

115 

.0198 

Hjj 

.0456 

.0672 

R|| 

Hfi 

Hill 


.111 

.121 

115 

116 

|R 

.0331 

.0452 

.0567 


.0786 

.0890 

.0994 

HB 

.120 

116 

117 

R f • 

Ri ^ 

.0449 

.0662 

R R 

.0778 

.0883 

.0986 

RobI 

.119 

117 

118 

R 

R:^ 

.0445 

.0567 

RQ 

mm 

.0875 

.0977 

.108 

.118 

118 

119 

R*' 

R:$ 

.0441 

.0553 

RQ 


.0868 

.0969 

,107 

.117 

119 



R|| 

.0437 

.0648 

Rg 

m 

,0861 


,106 

.116 

120 

121 

HI! 

Hi 

.0434 



.0763 

.0864 

^R R1 

m 

,115 

121 

122 

R K! 



I^Q 

1^1 



R(^ 

HQI 

114 

122 

123 

R K 1 

IQ C 

.0427 


1^^ 

.0741 

.0841 

Ri': 


.113 

123 

124 

Rt : 

IQ n 

.0424 

R^ 

I^Q 



R;: 1 

.103 

.112 

124 

126 

R|| 

H| 


.0527 

|B 

|B 

|B 

B| 

.102 

■ 111 

125 

126 

Hi 

R ; 

.0417 

.0523 



^R SI 


.101 

,110 

126 


Im ^ 

R'!: 

.0414 

.0619 

.0620 

I^Q 

R:' i 

RQ 


.110 

127 

128 

■Q V 

R'iil 


.0516 


I^E 

R:|: 

RQ 


,109 

128 

H[bB 


R^ ? 

.0407 

.0611 

.0610 

IRI 

R:i! 

RR 


108 

129 

li 

H| 

R|| 

Rgg 

.0507 

.0606 

|B 

Hfi 

B 


.107 

130 


1 9 



.0503 

1R 

.0696 

,0790 

,0882 

HR 

.106 

131 


R 1 g 

l^^n 

.0398 

.0499 

R 

.0691 


.0876 

RR 

.105 

132 

133 

R' K 

l^f 


.0496 

Riffi 


RR 

.0869 

RiHi 

.105 

133 

134 

R! R 

RQI 

.0392 


Rl$ 

IQQ 


.0863 

.0952 

.104 

134 

136 

R|| 

Rgj 

.0389 


Hitt 

|B 

|H 

.0867 

.0945 

.103 

135 

H 

HI 


.0387 

.0486 

.0679 


.0762 



.102 

136 

137 

R 1 $ 

HIhISI 

.0384 

MS 

R m 


.0756 

R||g 

1 

.102 

137 


m 

.0279 

.0381 

mm 

RQ 



R 


.101 

138 

■eS 

K m 



mm 

R Q 


.0745 

R:% 


.100 

139 

140 

HB 



B 

Hi 

IB 




,.0996 

140 









































































































































































FRANK E. GRUBBS


TABLE II —Concluded

  n    c = 0      1       2       3       4       5       6       7       8       9      n

 141   .0162   .0273   .0373   .0468   .0559   .0648   .0735   .0821   .0905   .0989   141
 142   .0161   .0271   .0370   .0464   .0555   .0643   .0730   .0815   .0899   .0982   142
 143   .0160   .0269   .0368   .0461   .0551   .0639   .0725   .0809   .0893   .0975   143
 144   .0159   .0267   .0365   .0458   .0547   .0635   .0720   .0804   .0887   .0969   144
 145   .0158   .0266   .0363   .0455   .0544   .0630   .0715   .0798   .0881   .0962   145
 146   .0156   .0264   .0360   .0452   .0540   .0626   .0710   .0793   .0875   .0956   146
 147   .0155   .0262   .0358   .0449   .0536   .0622   .0705   .0788   .0869   .0949   147
 148   .0154   .0260   .0356   .0446   .0533   .0618   .0701   .0783   .0863   .0943   148
 149   .0153   .0259   .0353   .0443   .0529   .0614   .0696   .0777   .0858   .0937   149
 150   .0152   .0257   .0351   .0440   .0526   .0610   .0692   .0772   .0852   .0931   150
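The entries of Tables I and II can be spot-checked numerically. The following is a minimal sketch, assuming that P(c, n, p) denotes the cumulative binomial probability of observing c or fewer defectives in a sample of n when the fraction defective is p. That reading is an assumption, although it reproduces the leading entries of Table II (p = .900 for n = 1, c = 0). The function names are illustrative and do not come from the paper.

```python
from math import comb


def acceptance_probability(c, n, p):
    """Cumulative binomial probability of at most c defectives in n items."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(c + 1))


def fraction_defective(c, n, target, tol=1e-12):
    """Solve acceptance_probability(c, n, p) = target for p by bisection.

    Requires c < n, so that the probability falls from 1 to 0 as p goes
    from 0 to 1.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if acceptance_probability(c, n, mid) > target:
            lo = mid  # probability still too high: p must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Table II uses target = .10; Table I appears to use target = .95.
    print(round(fraction_defective(0, 2, 0.10), 3))
    print(round(fraction_defective(0, 141, 0.10), 4))
    print(round(fraction_defective(0, 106, 0.95), 6))
```

Bisection is used because the acceptance probability is monotone in p, so no derivative or starting guess is needed.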


TABLE III

(Based on Poisson approximation to the binomial distribution)

Acceptance     Values of a_1 = np_1 for which     Values of a_2 = np_2 for which
Number         P(c, a_1) = .95                    P(c, a_2) = .10

  0             .05129                              2.303
  1             .3554                               3.890
  2             .8177                               5.322
  3            1.366                                6.681
  4            1.970                                7.994
  5            2.613                                9.275
  6            3.285                               10.53
  7            3.981                               11.77
  8            4.695                               12.99
  9            5.425                               14.21
 10            6.169                               15.41
 11            6.924                               16.60
 12            7.690                               17.78
 13            8.464                               18.96
 14            9.246                               20.13
 15           10.04                                21.29
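The Poisson values of Table III can be recomputed in the same way. The sketch below assumes that P(c, a) is the cumulative Poisson probability of c or fewer occurrences when the mean is a = np; the helper names are hypothetical.

```python
from math import exp


def poisson_cdf(c, a):
    """Probability of at most c occurrences for a Poisson variable with mean a."""
    term = exp(-a)
    total = term
    for k in range(1, c + 1):
        term *= a / k  # builds e^{-a} a^k / k! recursively
        total += term
    return total


def poisson_mean_for(c, target, upper=100.0, tol=1e-10):
    """Solve poisson_cdf(c, a) = target for a by bisection (the cdf decreases in a)."""
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poisson_cdf(c, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for c in range(4):
        print(c, poisson_mean_for(c, 0.95), poisson_mean_for(c, 0.10))
```

Dividing a tabulated value of a by the sample size n then gives the approximate fraction defective p = a/n.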


REFERENCES

[1] Catherine M. Thompson, "Tables of percentage points of the incomplete beta function," Biometrika, Vol. 32 (1941), pp. 151-181.

[2] Maxine Merrington and Catherine M. Thompson, "Tables of percentage points of the inverted beta (F) distribution," Biometrika, Vol. 33 (1943), pp. 73-88.

[3] Paul Peach and S. B. Littauer, "A note on sampling inspection," Annals of Math. Stat., Vol. 17 (1946), pp. 81-84.



ON THE RANGE-MIDRANGE TEST AND SOME TESTS WITH BOUNDED 

SIGNIFICANCE LEVELS¹

By John E. Walsh 
The RAND Corporation 

1. Summary. This paper is divided into two parts. The significance tests
investigated in Part I concern the population mean and are based on the quantity

[(sample midrange) - (hypothetical mean)]/(sample range).

The case in which the observations are a sample from a normal population is
considered in detail. The tests investigated are summarized in Table 1. These
tests are found to be very efficient for small samples (see Table 4; power
efficiency is defined in section 3). An investigation of several extremely
non-normal populations using the values of $D_\alpha$ obtained for normality
indicates that the significance level of the range-midrange test is not very
sensitive to the requirement of normality for small samples (see Table 6).
Also the tests of Table 1 can be applied without computation through the use
of an easily constructed graph (see section 4). These properties suggest that
the range-midrange test is preferable to the Student t-test and the analogue
of the Student t-test using the sample range (see [1] and [2]) whenever the
sample size is sufficiently small.

Use of the range-midrange test for the case of normality was proposed by E. S.
Pearson in [3], where properties of the test were experimentally investigated
for the normal and certain non-normal populations.

In Part II several significance tests for the mean are developed which have a
specified significance level for the case of a sample from a normal population
but whose significance level is bounded near the specified value under very
general conditions, one of which is that the observations are from continuous
symmetrical populations. Some of these tests are range-midrange tests. Table
2 contains a summary of the tests and their properties ($x_i$ = $i$th largest
observation, $i = 1, \dots, n$; conditions (D) are given in section 7).

PART I. THE RANGE-MIDRANGE TEST 

2. Introduction. In 1929 E. S. Pearson proposed using the range-midrange
test for the case of a sample from a normal population (see [3]) and
experimentally investigated some of its properties for sample sizes of 5 and 10
and significance levels of 2% and 10% (symmetrical tests). Using the constants
(corresponding to the $D_\alpha$ in this paper) determined for the case of normality,
¹ This paper was presented to a joint meeting of the Institute of Mathematical Statistics
and the American Mathematical Society at New Haven, Conn. in September, 1947. The
results presented in this paper were obtained in the course of research conducted under the
sponsorship of the Office of Naval Research. This research was performed while the
author was at Princeton University.




significance level and power function properties of these four tests were
experimentally investigated for several non-normal populations. The results of
this empirical investigation indicated that the range-midrange test is very
efficient for normality and not very sensitive to the assumption of normality
if the sample size is sufficiently small.

This paper presents an analytical investigation of properties of the
range-midrange test for $n = 2, 3, \dots, 10$ and a wide range of significance
levels. The results of this investigation confirm the contention that the
range-midrange test is very efficient for normality and small samples; also an
analytical investigation of how the significance level changes for the case of
certain extremely non-normal populations furnishes results which agree with the
contention that the range-midrange test is not very sensitive to the requirement
of normality for sufficiently small samples.

In most cases the results presented in this paper are not directly comparable
with those obtained by Pearson. It was possible, however, to obtain values of
$D_\alpha$ ($\alpha$ = 5%, 1%; $n$ = 5, 10) from the results presented in [3]; these values
were found to be in close agreement with the corresponding values of Table 5.

3. Efficiency of range-midrange. The purpose of this section is to use the
relations derived in section 6 to determine the power efficiencies of tests A, B
and C (see Table 1) for $\alpha$ = 1%, 5% and $n = 2, \dots, 10$. To do this the method
of defining power efficiency given in [4] and [5] will be used. As shown in [6],
it is sufficient to consider only test A; for any fixed $n$ and $\alpha$, tests A, B and C all
have the same power efficiency (note that the significance level of test C is $2\alpha$).

For a normal population (unknown variance) the most powerful test of the
one-sided alternative $\mu < \mu_0$ is the appropriate Student t-test. The procedure
used in determining the power efficiency of test A consists in first computing the
power function of test A for the given values of $n$ and $\alpha$; then the sample size
of the corresponding Student t-test at this significance level is varied until the
power function of the t-test is approximately equal to that of test A. The
sample size (not necessarily integral) thus obtained for the t-test, divided by $n$, is
called the power efficiency of test A for the given values of $n$ and $\alpha$. Intuitively
the power efficiency of a test measures the percentage of the total available
information per observation which is being utilized by that test.

Table 3 contains values of the power function for test A. These values were
computed from equation (3) of section 6 by approximate integration.

The corresponding values of the power function for the Student t-test were
found by using the normal approximation given in [6]. This approximation
was used for fractional degrees of freedom. The sample sizes considered as
well as the resulting power function values are listed in Table 3. A comparison
of the power function values for the two types of tests furnishes the approximate
power efficiencies listed in Table 3.

For $n = 2$, test A is itself a Student t-test. The power efficiency is therefore
100% for that sample size. This combined with Table 3 furnishes power
efficiencies at the 1% level for $n$ = 2, 6, 8, 10 and at the 5% level for $n$ = 2, 6, 10.
The approximate power efficiencies given in Table 4 for other values of $n$ were
obtained from these values by graphical interpolation.

Table 4 shows that the power efficiency for $\alpha$ = 1% is very good for $n < 8$,
while for $\alpha$ = 5% the efficiency is good for $n < 6$.

TABLE 1

Summary of range-midrange tests

Definitions

Test based on sample of size $n$ ($2 \le n \le 10$) from an arbitrary normal population.
$x_1$ = smallest sample value.
$x_n$ = greatest sample value.
$\mu$ = the mean of the normal population.
$\mu_0$ = given hypothetical mean value to be tested.
$D$ = [(sample midrange) - (hypothetical mean)]/(sample range)
    = $[(x_n + x_1)/2 - \mu_0]/(x_n - x_1)$.
$D_\alpha$ = constant depending on $n$ and $\alpha$.
Values of $\alpha$ versus $D_\alpha$ for $2 \le n \le 10$ and $\alpha$ = 5%, 2.5%, 1%, 0.5% are given in Table 5.

Tests

Test     Accept               If                    Significance Level

(A)      $\mu < \mu_0$        $D < -D_\alpha$       $\alpha$
(B)      $\mu > \mu_0$        $D > D_\alpha$        $\alpha$
(C)      $\mu \ne \mu_0$      $|D| > D_\alpha$      $2\alpha$
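The decision rules of Table 1 can be sketched in a few lines. The function names and the sample values below are hypothetical; the constant .425 is the Table 5 entry for n = 5 and a one-sided level of 5%.

```python
def range_midrange_statistic(sample, mu0):
    """D = [(sample midrange) - (hypothetical mean)] / (sample range)."""
    x1, xn = min(sample), max(sample)
    if xn == x1:
        raise ValueError("sample range is zero; D is undefined")
    return ((xn + x1) / 2.0 - mu0) / (xn - x1)


def range_midrange_tests(sample, mu0, d_alpha):
    """Apply tests A, B and C of Table 1 for a given constant d_alpha."""
    d = range_midrange_statistic(sample, mu0)
    return {
        "D": d,
        "A": d < -d_alpha,      # accept mu < mu0 (level alpha)
        "B": d > d_alpha,       # accept mu > mu0 (level alpha)
        "C": abs(d) > d_alpha,  # accept mu != mu0 (level 2*alpha)
    }


if __name__ == "__main__":
    sample = [9.1, 10.4, 11.0, 9.8, 10.7]  # illustrative data, n = 5
    print(range_midrange_tests(sample, mu0=12.0, d_alpha=0.425))
```

Only the two extreme observations enter the statistic, which is why the test can also be applied graphically in the plane of the smallest and largest sample values.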


4. Construction of graph. In most problems to which a test of the type
developed in this paper would be applied, the values of the sample can be
considered to have practical lower and upper limits, say $a$ and $b$. For example, in
many situations zero is a lower limit for the sample values. From a practical
viewpoint these limits on the sample values do not contradict the assumption
that the population is normal, since the area under that part of the normal
distribution which lies outside the interval $(a, b)$ can be considered negligible.
Thus, since $\Pr(u/v \le w) = \Pr(u \le vw)$ when $v > 0$ (as holds for the sample
range), test A can be restated in the form
Accept $\mu < \mu_0$ if the sample point $(x_1, x_n)$ falls in the region (A) of the $x_1$, $x_n$








TABLE 2

Some one-sided and symmetrical tests with bounded


TABLE 3

Power function values for test A


Type of Test

Sample Size

Approx. Efficiency

Significance Level

Approximate Values of Power Function

m 

6 - 1 

g =• 

S w* 2 

6 = 21 



% 

mhh 






t 

5.4 



.244 

.607 

.886 

.969 


A 

6 

00 


.259 

.599 

.868 

.967 


t 

MM 



.333 


.971 



A 

M 

75 

Hi 

.351 


.962 


1 

t 

5.88 


.01 


.248 

.551 

1 1 

.820 

.957 

A 

6 

98 

.01 

mjm 

.271 

.568 

.809 

.935 

1 

7.2 


— 


.371 

.749 

.949 


A 

8 

90 

mm 


.389 

.728 

.923 


t 

mm 


.01 

.108 

.453 

.832 

.976 


A 


80 

.01 

.124 

.462 

.814 

.963 



TABLE 4

Power efficiencies of tests A, B and C for α = 5%, 1% and 2 ≤ n ≤ 10

                                        n
  α       2       3       4       5       6       7       8       9       10

 .01    100%    99.7%   99.4%    99%     98%     96%     90%     85%     80%
 .05    100%    98.6%    96%    93.5%    90%    86.5%   82.5%   78.5%    75%


TABLE 5

Approximate values of D_α for α = 5%, 2.5%, 1%, 0.5% and 2 ≤ n ≤ 10

                                        n
  α        2       3       4       5       6       7       8       9       10

 0.5%    31.83   3.02*   1.37*   .86*    .66     .55*    .475    .425    .39*
 1%      15.91   2.11*   1.04*   .71     .56*    .475    .42*    .38     .36*
 2.5%     6.35   1.30    .74     .52     .43     .375    .33     .30     .275
 5%       3.16   .90*    .555*   .425    .36*    .30     .265    .24     .225*

* These values of D_α were verified directly by substitution and integration.
The remaining values of D_α for 3 ≤ n ≤ 10 were obtained from these and other
values of D_α (α ≠ .005, .01, .025, .05) by graphical interpolation.



































plane defined by

$$(1/2 + D_\alpha)x_n + (1/2 - D_\alpha)x_1 < \mu_0, \quad x_n > x_1, \quad a < x_1, \quad x_n < b.$$

TABLE 6

Effect of non-normality on the significance level of the range-midrange test


Significance Level




.010 064 ,039 . 018 . 010 .128 . 078 . 030 020 


.0096 .063 .033 . 017 , 0096 .100 . 066 . 034 . 0192 


,015 . 0094 .086 .058 030 . 0188 


.0031 036 .017 .0063 . 0031 .072 . 034 . 0126 , 0062 


.0024 .043 016 .0055 . 0024 . 086 . 032 . 0101 .0018 


.0027 .095 .026 . 0069 . 0027 .190 052 , 0118 0064 


073 . 050 .119 .104 . 073 . 050 . 238 . 208 .146 ,100 


.002 061 .055 015 .124 .122 .110 . 090 


031 .029 . 031 .031 .031 .029 . 062 . 062 . 062 . 068 


.069 .036 .172 .116 .062 .036 


0016 0007 .144 ,104 . 066 042 167 .109 . 067 . 043 


0006 .122 .096 . 061 .046 .139 .102 . 062 . 046 


,030 ,017 .131 .080 .038 ,021 


.083 066 .031 .018 .114 .071 .038 021 


,0031 .068 


.019 . 096 . 065 . 032 . 020 


027 014 , 0063 ,0026 .112 072 037 . 021 .139 , 086 . 042 , 024 


.0019 . 009 .067 . 039 .024 .128 . 078 . 048 . 026 




Likewise test B can be restated as
Accept $\mu > \mu_0$ if $(x_1, x_n)$ falls in the region (B) defined by

$$(1/2 - D_\alpha)x_n + (1/2 + D_\alpha)x_1 > \mu_0, \quad x_n > x_1, \quad a < x_1, \quad x_n < b.$$



















































































































































































Test C now becomes

Accept $\mu \ne \mu_0$ if $(x_1, x_n)$ falls in either of the regions (A) or (B).

Figure 1(i) contains a schematic diagram of the regions (A) and (B). Test A
can be applied by constructing a graph of the region (A) and giving the
instructions to accept $\mu < \mu_0$ if $(x_1, x_n)$ falls in (A). Similarly for test B and
region (B). Test C is applied by constructing a graph of both (A) and (B) and
accepting $\mu \ne \mu_0$ if $(x_1, x_n)$ falls in either (A) or (B).

Frequently it is desirable to consider more than one significance level
simultaneously. This can be accomplished in the manner indicated by Figure 1(ii).

5. Effect of non-normality on significance level. It has been shown that the
range-midrange test compares very favorably with the Student t-test for
sufficiently small samples and normality. In practice, however, it may happen that
normality is assumed for cases in which the population is not even approximately
normal. Although this represents an error in judgment on the part of the
person applying the test, such situations will undoubtedly occur if the
range-midrange test is used very frequently. The purpose of this section is to
investigate the effect of non-normality on the significance level of the
range-midrange test when the values of $D_\alpha$ based on normality are used. The
corresponding effect of these non-normal populations on the significance level of
the t-test was not considered because of computational difficulties; however the
effect of some other non-normal populations on the significance level of the
t-test was experimentally investigated by Pearson in [3]. The results of this
empirical investigation and of later investigations show that the significance
level of the t-test is not very sensitive to the requirement of normality for small
samples.

Six populations were chosen for investigation. Three of these populations are
symmetrical while the remaining three are strongly asymmetrical. These
particular populations were considered because their probability density
functions have a wide variety of different shapes; also because the significance level
of the range-midrange test can be computed in closed form for these populations.

The populations investigated are defined by their probability density functions.
Table 6 contains a list of the probability density functions considered along with
the resulting significance levels for the range-midrange test. The cases
investigated are $n$ = 3, 4, 6 and $\alpha$ = 5%, 2.5%, 1%, 0.5%. Larger values of
$n$ were not used because of computational difficulties. The situation of $n = 2$
was not considered because the t-test and the range-midrange test are identical
for this case. The significance levels of Table 6 were computed by making
direct application of (1) and (2) of section 6.


6. Significance level and power function derivations. The purpose of this
section is to present derivations of the significance level and power function
expressions which were used in the preceding sections. First a general
probability expression will be evaluated. Direct applications of the results obtained
for this expression yield the required significance level and power function
relations.

Let $x_1$ and $x_n$ be the smallest and largest values, respectively, of a sample of
size $n$ drawn from a population with probability density function $f(x)$. The
non-zero probability range of this population is $\gamma < x < \beta$. Also let three
constants $c_1$, $c_n$, $c_0$ ($c_1 + c_n = 1$) be given and consider the value of

$$\Pr(c_1 x_1 + c_n x_n < c_0), \quad \text{where } M(z) = \int_{-\infty}^{z} f(y)\,dy.$$

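Using direct methods it is found that the value of this expression is given by

$$
\begin{cases}
[M(c_0)]^n & \text{if } c_1 = 0,\\[4pt]
0 & \text{if } 0 < c_1 < 1,\ c_0 \le \gamma,\\[4pt]
\left[M\!\left(\dfrac{c_0 - c_1\gamma}{c_n}\right)\right]^n - n\displaystyle\int_{c_0}^{(c_0 - c_1\gamma)/c_n} \left[M(V) - M\!\left(\dfrac{c_0 - c_nV}{c_1}\right)\right]^{n-1} f(V)\,dV & \text{if } 0 < c_1 < 1,\ c_0 > \gamma,\\[4pt]
1 - [1 - M(c_0)]^n & \text{if } c_1 = 1,\\[4pt]
0 & \text{if } c_1 > 1,\ c_0 \le \min[\gamma,\ c_1\gamma + c_n\beta],\\[4pt]
1 - \left[M\!\left(\dfrac{c_0 - c_1\gamma}{c_n}\right)\right]^n - n\displaystyle\int_{(c_0 - c_1\gamma)/c_n}^{\beta} \left[M(V) - M\!\left(\dfrac{c_0 - c_nV}{c_1}\right)\right]^{n-1} f(V)\,dV & \text{if } c_1 > 1,\ c_1\gamma + c_n\beta < c_0 \le \gamma,\\[4pt]
1 - n\displaystyle\int_{c_0}^{\beta} \left[M(V) - M\!\left(\dfrac{c_0 - c_nV}{c_1}\right)\right]^{n-1} f(V)\,dV & \text{if } c_1 > 1,\ c_0 > \gamma.
\end{cases}
\tag{1}
$$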





The value of $\Pr(c_1x_1 + c_nx_n < c_0)$ for $c_1 < 0$ can be obtained from the above
results for $c_1 > 1$. It is easily shown that

$$\Pr(c_1x_1 + c_nx_n < c_0) = 1 - \Pr(c_1'y_1 + c_n'y_n < c_0'), \tag{2}$$

where

$$c_1' = c_n, \quad c_n' = c_1, \quad c_0' = -c_0,$$

and $y_1$, $y_n$ are the smallest and largest values, respectively, of a sample of size $n$
drawn from a population with probability density function $g(y) = f(-y)$. Thus if
$c_1 < 0$, $c_1' = c_n > 1$ and obvious modifications of the results for $c_1 > 1$ will
furnish the value of $\Pr(c_1'y_1 + c_n'y_n < c_0')$.
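The general probability $\Pr(c_1x_1 + c_nx_n < c_0)$ can be checked numerically for any population whose density and distribution function are available. The sketch below is an illustration and not the author's computation: it assumes $c_1 > 0$, conditions on the largest value $x_n = V$, and subtracts the probability of the complementary event $x_1 \ge (c_0 - c_nV)/c_1$, whose density in $V$ is $n[M(V) - M(w)]^{n-1}f(V)$. The rectangular population on (0, 1) and all function names are assumptions chosen because the answers can be verified by elementary geometry.

```python
def prob_linear_combination(c1, c0, n, pdf, cdf, lower, upper, cells=20000):
    """Pr(c1*x_1 + (1 - c1)*x_n < c0) for c1 > 0, by the midpoint rule.

    x_1 and x_n are the smallest and largest of n independent observations
    from a population with density pdf and distribution function cdf on
    (lower, upper).
    """
    cn = 1.0 - c1
    h = (upper - lower) / cells
    complement = 0.0
    for j in range(cells):
        v = lower + (j + 0.5) * h    # value of the largest observation
        w = (c0 - cn * v) / c1       # x_1 must be >= w in the complement
        w = min(max(w, lower), v)    # x_1 always lies between lower and v
        complement += n * pdf(v) * (cdf(v) - cdf(w)) ** (n - 1) * h
    return 1.0 - complement


def uniform_pdf(x):
    return 1.0


def uniform_cdf(x):
    return min(max(x, 0.0), 1.0)


if __name__ == "__main__":
    # n = 2, rectangular population on (0, 1)
    print(prob_linear_combination(0.5, 0.5, 2, uniform_pdf, uniform_cdf, 0.0, 1.0))
    print(prob_linear_combination(2.0, 0.5, 2, uniform_pdf, uniform_cdf, 0.0, 1.0))
    print(prob_linear_combination(2.0, -0.5, 2, uniform_pdf, uniform_cdf, 0.0, 1.0))
```

For $c_1 < 0$ the same routine can be applied after the reflection described in (2), that is, to the population with density $f(-y)$ and with the roles of the two coefficients interchanged.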

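The above general results were used in section 5 to investigate the effect of
non-normality on the significance level of the range-midrange test.

Now consider the case in which the $n$ sample values are drawn from a normal
population with mean $\mu$ and variance $\sigma^2$. Then, for test A,

$$\text{Power Function} = \Pr[(1/2 - D_\alpha)x_1 + (1/2 + D_\alpha)x_n < \mu_0] = \Pr[(1/2 - D_\alpha)z_1 + (1/2 + D_\alpha)z_n < \delta],$$

where

$$z_1 = (x_1 - \mu)/\sigma, \quad z_n = (x_n - \mu)/\sigma, \quad \delta = (\mu_0 - \mu)/\sigma.$$

Using the above results with

$$f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}, \quad M(x) = N(x) = \int_{-\infty}^{x} f(y)\,dy,$$

it is found that the power function for test A is

$$
\begin{cases}
1 - n\displaystyle\int_{\delta}^{\infty} \left[N(V) - N\!\left(\dfrac{\delta - (1/2 + D_\alpha)V}{1/2 - D_\alpha}\right)\right]^{n-1} f(V)\,dV & \text{if } D_\alpha < 1/2,\\[4pt]
[N(\delta)]^n & \text{if } D_\alpha = 1/2,\\[4pt]
n\displaystyle\int_{-\delta}^{\infty} \left[N(V) - N\!\left(\dfrac{(D_\alpha - 1/2)V - \delta}{D_\alpha + 1/2}\right)\right]^{n-1} f(V)\,dV & \text{if } D_\alpha > 1/2.
\end{cases}
\tag{3}
$$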

The value of $D_\alpha$ (for given $n$) corresponding to a specified significance level $\alpha$
for test A is obtained by solving the equation

$$\alpha = P_A(0), \tag{4}$$

where $P_A(\delta)$ is the power function for test A. From symmetry and the fact that
test C is a combination of tests A and B, test B has significance level $\alpha$ and test
C significance level $2\alpha$ for this value of $D_\alpha$.

For $n = 2$, test A becomes a Student t-test with one degree of freedom if $D_\alpha$
is replaced by $t_\alpha/2$. The relation $D_\alpha = t_\alpha/2$ gives an easily applied method of
computing $D_\alpha$ for this case.
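The solution of $\alpha = P_A(0)$ can be sketched numerically. The code below evaluates the power function of test A, $\Pr[(1/2 - D_\alpha)z_1 + (1/2 + D_\alpha)z_n < \delta]$, by a midpoint rule and then finds $D_\alpha$ by bisection. The integrands are those obtained by conditioning on the largest standardized value (with a reflection when $D_\alpha > 1/2$); they are a reconstruction and should be treated as an assumption rather than as the author's exact working formulas. The function names, the truncation of the integral at 8 standard deviations, and the number of cells are likewise illustrative choices. For $n = 2$ the result can be compared with $t_\alpha/2$, using the fact that Student's t with one degree of freedom has upper percentage point $\tan[\pi(1/2 - \alpha)]$.

```python
from math import erf, exp, pi, sqrt, tan


def normal_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def normal_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def power_test_a(d_alpha, n, delta, cells=4000, cutoff=8.0):
    """Pr[(1/2 - D)z_1 + (1/2 + D)z_n < delta] for a standard normal sample."""
    if d_alpha == 0.5:
        return normal_cdf(delta) ** n
    if d_alpha < 0.5:
        start = delta

        def inner(v):
            return (delta - (0.5 + d_alpha) * v) / (0.5 - d_alpha)
    else:
        start = -delta

        def inner(v):
            return ((d_alpha - 0.5) * v - delta) / (d_alpha + 0.5)
    stop = max(start, 0.0) + cutoff  # the normal tail beyond this is negligible
    h = (stop - start) / cells
    integral = 0.0
    for j in range(cells):
        v = start + (j + 0.5) * h
        integral += normal_pdf(v) * (normal_cdf(v) - normal_cdf(inner(v))) ** (n - 1)
    integral *= n * h
    return 1.0 - integral if d_alpha < 0.5 else integral


def solve_d_alpha(n, alpha, lo=0.0, hi=100.0, iterations=40):
    """Bisection for D with power_test_a(D, n, 0) = alpha (level decreases in D)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if power_test_a(mid, n, 0.0) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(solve_d_alpha(2, 0.05), 0.5 * tan(pi * 0.45))  # n = 2 check against t/2
    print(solve_d_alpha(5, 0.05))  # may be compared with the Table 5 entry
```

Calling `power_test_a` with a nonzero `delta` gives points of the power function of the kind tabulated in Table 3.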

Approximate values of $D_\alpha$ for $\alpha$ = 5%, 2.5%, 1%, 0.5% are contained in
Table 5 for $2 \le n \le 10$. For $3 \le n \le 10$, these values were obtained from (3)
and (4) by approximate integration and interpolation. For $n = 2$, the relation
between $D_\alpha$ and $t_\alpha$ was used.

PART II. SOME TESTS WITH BOUNDED SIGNIFICANCE LEVELS 

7. Introduction. In this part some significance tests (for the mean) are
derived which are based on the assumption of a sample from a normal population.
These tests have the property that the significance level is bounded near the
value for normality under very general conditions. Those conditions are

(D) (a) The observations used for a test are independent.

    (b) Each observation comes from a continuous symmetrical population
        with mean $\mu$.

It is to be emphasized that no two observations are necessarily drawn from
the same population.

The bounded significance level tests developed are summarized in Table 2.
These tests can be used to supplement the tests presented in [5] for $n < 9$, where
the tests of [5] do not furnish a very wide variety of suitable significance levels.

8. Outline of derivations. Let us consider the range-midrange test for the more general situation in which the set of independent observations used are from arbitrary but fixed populations satisfying conditions (D). Let $D_\alpha$ be redefined so that the resulting test A has significance level $\alpha$. Then it is easily seen that $D_\alpha$ is a monotone decreasing function of $\alpha$. Thus the significance level of the modified test A will always be less than or equal to $(1/2)^n$ if $D_\alpha \ge 1/2$. The significance level bounds for the tests $n = 4$, $\alpha = 5\%$; $n = 5$, $\alpha = 2.5\%$; $n = 6$, $\alpha = 1\%$; $n = 7$, $\alpha = 0.5\%$ of Table 2 were obtained from this relation and obvious significance level relations among tests A, B and C.

The significance levels (for normality) for the tests $n = 5$, $\alpha = 5\%$; $n = 6$, $\alpha = 2.5\%$; $n = 7$, $\alpha = 1\%$; $n = 8$, $\alpha = 0.5\%$ were obtained by approximate integration of the expression derived for $\Pr[(1/2 + c)x_n + (1/2 - c)x_{n-1} < \mu]$, $(0 \le c \le 1/2)$, for several values of $c$ and then graphical interpolation (here $\alpha$ is the one-sided test significance level). The significance level bounds were determined from

$(1/2)^n = \Pr(x_n < \mu) \le \Pr[(1/2 + c)x_n + (1/2 - c)x_{n-1} < \mu] \le \Pr[(1/2)(x_n + x_{n-1}) < \mu] = (1/2)^{n-1}$.

The significance levels for the tests $n = 8$, $\alpha = 1\%$; $n = 9$, $\alpha = 0.5\%$ were obtained by considering the relations

$\Pr\{\max[x_{n-1},\ (x_n + x_{n-i})/2] < \mu\} = (1 + i)(1/2)^n$, $(i = 0, 1, 2, 3)$,

and applying linear interpolation to find a value $c$, $(0 \le c \le 1/2)$, such that $\Pr\{\max[x_{n-1},\ 0.5x_n + c\,x_{n-2} + (1/2 - c)x_{n-1}] < \mu\}$ has the desired value. The significance level bounds were found from

$\Pr[(1/2)(x_n + x_{n-1}) < \mu] \le \Pr\{\max[x_{n-1},\ 0.5x_n + c\,x_{n-2} + (1/2 - c)x_{n-1}] < \mu\} \le \Pr\{\max[x_{n-1},\ (1/2)(x_n + x_{n-2})] < \mu\}$.

The derivation of the power efficiencies listed in Table 2 will not be considered here. Detailed derivations can be found in [7].
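The exact levels $(1 + i)(1/2)^n$ quoted above can be checked by a small enumeration. This is a sketch under stated assumptions, not the derivation of [7]: it assumes the ordering $x_1 \le \cdots \le x_n$, and it uses the fact that, under conditions (a) and (b), the signs of the deviations from $\mu$ are independent fair coin flips once their absolute values are fixed. The absolute deviations used are arbitrary illustrative numbers.

```python
from fractions import Fraction
from itertools import product

def level_by_sign_enumeration(abs_devs, i):
    """Exact Pr{max[x_(n-1), (x_(n) + x_(n-i))/2] < mu} for fixed absolute
    deviations, obtained by giving each deviation an independent random sign."""
    n = len(abs_devs)
    hits = 0
    for signs in product((-1, 1), repeat=n):
        # Deviations x_j - mu in increasing order: d[-1] is x_(n) - mu, etc.
        d = sorted(s * a for s, a in zip(signs, abs_devs))
        x_n, x_n_minus_1, x_n_minus_i = d[-1], d[-2], d[-1 - i]
        if max(x_n_minus_1, (x_n + x_n_minus_i) / 2) < 0:
            hits += 1
    return Fraction(hits, 2 ** n)

if __name__ == "__main__":
    abs_devs = [0.3, 1.1, 2.0, 2.7, 4.5, 5.2]  # hypothetical, all distinct
    for i in range(4):
        print(i, level_by_sign_enumeration(abs_devs, i))
```

Because the count does not depend on the particular absolute values, the same level is obtained when the observations come from different symmetrical populations.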

REFERENCES 

[1] J. F. Daly, "On the use of the sample range in an analogue of Student's t-test," Annals of Math. Stat., Vol. 17 (1946), pp. 71-74.

[2] E. Lord, "The use of range in place of standard deviation in the t-test," Biometrika, Vol. 34 (1947), pp. 41-67.

[3] E. S. Pearson, "The distribution of frequency constants in small samples from non-normal symmetrical and skew populations," Biometrika, Vol. 21 (1929), pp. 280-286.

[4] John E. Walsh, "On the power function of the sign test for slippage of means," Annals of Math. Stat., Vol. 17 (1946), pp. 358-362.

[5] John E. Walsh, "Some significance tests for the median which are valid under very general conditions," Annals of Math. Stat., Vol. 20 (1949), pp. 64-81.

610“6U. Submitted for publication in Annak of Math. Slat 

[6] N. L. Johnson and B. L. Welch, "Applications of the non-central t-distribution," Biometrika, Vol. 31 (1940), p. 376.

[7] John E. Walsh, "Some significance tests for the median which are valid under very general conditions," unpublished thesis, Princeton University Library, Princeton, N. J.



ASYMPTOTIC STUDENTIZATION IN TESTING OF HYPOTHESES 
By Herman Chernoff¹

Cowles Commission for Research in Economics

1. Summary. A method suggested by Wald for finding critical regions of almost constant size and various modifications are considered. Under reasonable conditions the $s$th step of this method gives a critical region of size $\alpha + R_s(\theta)$, where $\theta$ is the unknown value of the nuisance parameter, $R_s(\theta) = O(N^{-s/2})$ and $N$ is the sample size. The first step of this method gives the region which is obtained by assuming that an estimate $\hat\theta$ of the nuisance parameter is actually equal to $\theta$.

2. Introduction. The problem of nuisance parameters often arises in the testing of hypotheses in the following form: It is desired to construct a test of a hypothesis $H$ so that the probability of rejecting $H$ if it is true is equal to $\alpha$. However the probability distribution of the data is not uniquely determined by $H$. Indeed, if the hypothesis is true then the observations have a distribution depending on a nuisance parameter $\theta$ whose value is not known. Generally a critical region will have a size which depends on the value of $\theta$. Neyman has done considerable work on the problem of finding similar regions, i.e., regions whose size is independent of $\theta$.

Wald has suggested the following method of finding critical regions whose size is almost independent of $\theta$. Suppose that $t$ is a statistic such that if $\theta$ were known then the critical region $t \le c_1(\theta)$ would be a good critical region for testing the hypothesis $H$. Suppose also that $\hat\theta$ is an estimate of $\theta$ and that $g(t, \hat\theta \mid \theta)$ represents the joint distribution of $t$, $\hat\theta$ under $H$ when $\theta$ is the value of the nuisance parameter. Then consider the regions

$t \le c_1(\hat\theta)$, where $\Pr\{t \le c_1(\theta)\} = \alpha$ independent of $\theta$;

$t \le c_1(\hat\theta) + c_2(\hat\theta)$, where $\Pr\{t - c_1(\hat\theta) \le c_2(\theta)\} = \alpha$ independent of $\theta$;

$t \le c_1(\hat\theta) + \cdots + c_s(\hat\theta)$, where $\Pr\{t - c_1(\hat\theta) - \cdots - c_{s-1}(\hat\theta) \le c_s(\theta)\} = \alpha$ independent of $\theta$.

Under the assumption that $\hat\theta$ is close to $\theta$ it is reasonable to expect that $\Pr\{t \le c_1(\hat\theta)\}$ would be close to $\alpha$. It might also be expected that $\Pr\{t \le c_1(\hat\theta) + c_2(\hat\theta)\}$ would be even closer to $\alpha$.
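The first step of the method can be illustrated on a simple case that is not taken from the paper and is assumed here only for concreteness: $N$ independent normal observations with unknown standard deviation $\sigma$ (the nuisance parameter), $t = \sqrt{N}\,\bar{x}$, $c_1(\sigma) = z_\alpha \sigma$, and $\hat\theta = s$, the sample standard deviation. The region $t \le c_1(s)$ then has size $\Pr\{T_{N-1} \le z_\alpha\}$, with $T_{N-1}$ a Student variable; the sketch below evaluates this by Simpson's rule. All function names are hypothetical.

```python
import math
from statistics import NormalDist

def student_t_pdf(u, df):
    """Density of Student's t with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(u * u / df))

def student_t_cdf_nonpositive(z, df, steps=2000):
    """P(T_df <= z) for z <= 0: Simpson's rule on [0, |z|], then symmetry."""
    b = abs(z)
    h = b / steps  # steps must be even for Simpson's rule
    total = student_t_pdf(0.0, df) + student_t_pdf(b, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * student_t_pdf(k * h, df)
    return 0.5 - total * h / 3

def first_step_size(n_obs, alpha=0.05):
    """Size of the plug-in region t <= z_alpha * s in the assumed example."""
    z_alpha = NormalDist().inv_cdf(alpha)
    return student_t_cdf_nonpositive(z_alpha, n_obs - 1)

if __name__ == "__main__":
    for n_obs in (5, 10, 20, 50, 200):
        print(n_obs, round(first_step_size(n_obs), 5))
```

In this example the size approaches $\alpha$ as $N$ grows, which is the behaviour the first step is expected to show.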

¹ This paper is based on a dissertation written under the supervision of Professor Abraham Wald and submitted as partial fulfilment of the requirements for Ph.D. in the Graduate Division of Applied Mathematics of Brown University.

This method has been shown to have good properties when considered from the asymptotic point of view. Suppose that $t$, $\hat\theta$ are two sequences of statistics (depending on $N$, the size of the sample or an analogous variable) with distribution represented by $g(t, \hat\theta \mid \theta)$ where $N$ is understood to be present. Then it has been shown that under reasonable conditions, with modifications for the sake of calculation,

$|\Pr\{t \le c_1(\hat\theta) + \cdots + c_s(\hat\theta)\} - \alpha| = O(N^{-s/2})$.

The statement of the theorem presenting this result will be given in section 4. It has also been shown that if, roughly speaking, $\hat\theta$ is distributed almost symmetrically about $\theta$, the above result may be obtained in half the steps, i.e.,

$|\Pr\{t \le c_1(\hat\theta) + \cdots + c_s(\hat\theta)\} - \alpha| = O(N^{-s})$.

It is true that under relatively weak conditions and for fixed $N$ it is possible for any $\epsilon > 0$ to obtain a function $h(\hat\theta)$ such that $|\Pr\{t \le h(\hat\theta)\} - \alpha| < \epsilon$. However such a critical region can have very poor properties from the point of view of the alternative hypotheses especially if $h(\hat\theta)$ is a very wildly oscillating function. On the other hand this objection does not apply to Wald's method for large $N$ because

$|c_1^{(r)}(\theta)| \le M$, $r = 0, 1, \cdots, s$;

$|c_2^{(r)}(\theta)| \le M N^{-1/2}$, $r = 0, 1, \cdots, s - 1$;

$\cdots$

$|c_s^{(r)}(\theta)| \le M N^{-(s-1)/2}$, $r = 0, 1$,

and hence $c_1(\hat\theta) + \cdots + c_s(\hat\theta)$ is almost constant over "that small range in which $\hat\theta$ will probably fall."

In the above it has been implied that $\theta$ is a one dimensional variable. However the results are easily extended to the case where $\theta$ is a $k$-dimensional variable.

The direct application of the method is often quite difficult because of the calculations involved. Modifications can be applied which simplify the calculations. Such modifications usually consist of changing the $c_r(\theta)$ by a small amount provided the remainder is simple and "well behaved." A case where considerable simplifications can be made is that where $g_1(t \mid \hat\theta, \theta)$, the conditional distribution of $t$, can be expanded in a Taylor expansion,

$g_1(t \mid \hat\theta, \theta) = g_1(c_1(\theta) \mid \theta, \theta) + (t - c_1(\theta))\,\dfrac{\partial g_1}{\partial t} + \cdots$,

where the partial derivatives "behave." This case will be described in detail in section 3, and an example previously treated by Welch (see [1]) will be discussed in section 4.

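Another case where simplifications often arise is the asymptotic case, that is the case where $g(t, \hat\theta \mid \theta)$ has an asymptotic expansion. The asymptotic case may also be regarded as an extension of the following partition principle which is very useful. If $g(t, \hat\theta \mid \theta) = g_0(t, \hat\theta \mid \theta) + h(t, \hat\theta \mid \theta)$ and $\iint |h|\,dt\,d\hat\theta \le M N^{-s/2}$, and if $\varphi(\hat\theta)$ is such that

$\left|\int d\hat\theta \int_{-\infty}^{\varphi(\hat\theta)} dt\; g_0(t, \hat\theta \mid \theta) - \alpha\right| \le M N^{-s/2}$,

then $|\Pr\{t \le \varphi(\hat\theta)\} - \alpha| \le 2 M N^{-s/2}$. Thus our theorems apply to $g(t, \hat\theta \mid \theta)$ where $g = g_0 + h$ and $g_0$ has sufficient differentiability properties.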


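3. The Taylor expansion treatment. Let $g(t, \hat\theta \mid \theta) = g_1(t \mid \hat\theta, \theta)\, g_2(\hat\theta \mid \theta)$ where $g_1$ is the conditional density of $t$ given $\hat\theta$ and $g_2(\hat\theta \mid \theta)$ is the marginal density of $\hat\theta$. $g_3(t \mid \theta) = \int g(t, \hat\theta \mid \theta)\,d\hat\theta$ is the marginal density of $t$. In what follows we shall use $M$ as a generic bound. Thus the statement $f(t, \theta) \le M(\theta_1, \theta_2)$, $\theta_1 \le \theta \le \theta_2$, means that there is a constant $M$ depending on $(\theta_1, \theta_2)$ and independent of $t$, $\theta$, $N$ so that $f(t, \theta) \le M(\theta_1, \theta_2)$, $\theta_1 \le \theta \le \theta_2$.

First we obtain $c_1(\theta)$ so that $\Pr\{t \le c_1(\theta)\} = \alpha$.

Then we have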

Theorem 1. If for every finite interval (81, 82), 


(i) 


*'(<I« + A) 


<Gi(t,e) <Gi(t), |h| < p = 0 ,l,-.-s, 


ei< 8,8 + A^Bi, 

where dt < M{8i, 82), G^ and G2 may depend on N, di, and 6% 


.... \ 8 ) . , 

Ov — 88 ^— cnntenuowa zn t, 8 and 


bounded in absolute value by M(Ci , Ct, 8t, 82) for p + q < s, Bi < 8 < 82, 
Cl :< t < Cj; 

1 

01 , fl,) < 82<8<d2,C2<t<C2; 

(iv) 0 < a < 1 , 

then Pr{i < ci( 0 )} = a defines ci{8) uniquely and so that | Ci‘’’^( 9 ) 1 < Jlf (fli. 82) 

for V = 0,1, ■■■ ,s 82< 8 < 8i. 

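Proof. Since $g_3(t \mid \theta)$ is positive, $c_1(\theta)$ is uniquely defined by condition (i). From this and conditions (i) and (ii) it follows that $c_1'(\theta)$ exists and is given by

(1) $\displaystyle\int_{-\infty}^{c_1(\theta)} \frac{\partial g_3(t \mid \theta)}{\partial \theta}\,dt + c_1'(\theta)\, g_3(c_1(\theta) \mid \theta) = 0.$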





We may continue in this fashion differentiating formally p < s times to get 

( 2 ) + E[ci^‘’f 0 )]''[c{'»’(«)]’’ •• • 

+ ci’’’(fll^73(ci(fl) I = 0 , jt,ji, ,ih,i+j <p. 

From the continuity and positiveness it follows that c^’’\e) is continuous. Since 

/ w 

(?s (0 dt < M{9i., 6i) it folloM'S that there is a constant Midi, 6^) so that 

no 

r « 

Gj(0 dt < a, / t?s(0 dt < 1 — a. 

<o Jut.h.h) 


Thus 


From (1) and condition (i) it follows easily that | Ci(S) | < Af(fli, ^ 2 ). Similarly 
we obtain ] I ^ ^(9 l, 62 ) for < 5 < ^2 • 

While the conditions (i) to (iv) suffice to insure the results of the theorem they are not necessary. It is often possible to obtain these properties of $c_1(\theta)$ in particular examples where $g_3(t \mid \theta)$ does vanish at points so long as $g_3(c_1(\theta) \mid \theta)$ behaves well.

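Definition 1. $\varphi_m(\hat\theta)$ is an admissible function of order $m$ ($m \le s$, $s$ fixed in advance) if $\varphi_m(\hat\theta) = c_1(\hat\theta) + \cdots + c_m(\hat\theta)$ where $\Pr\{t \le c_1(\theta)\} = \alpha$ and

(3) $|c_i^{(p)}(\theta)| \le M(\theta_1, \theta_2)\, N^{-(i-1)/2}$, $p = 0, 1, \cdots, s + 1 - i$, $\theta_1 \le \theta \le \theta_2$.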


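Now let

(4) $H_k(\theta) = N^{k/2} E(\hat\theta - \theta)^k = N^{k/2} \displaystyle\int (\hat\theta - \theta)^k g_2(\hat\theta \mid \theta)\,d\hat\theta$ and

(5) $G_{pq}(\theta) = \left.\dfrac{\partial^{p+q}}{\partial t^p\,\partial \hat\theta^q}\, g_1(t \mid \hat\theta, \theta)\right|_{t = c_1(\theta),\ \hat\theta = \theta}.$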

We have 
Theobbm 2. If 

(i) Pr{t < ci(ff)) - a, 0 < « < 1, and | cl'’’(0) 1 < , ^ 2 ), 

9i^ 9 <9x,p=‘0,1,,8\ 

(ii) S " 5{N) - 0(1) is a function of N awdi that 



M I '') ^ ^(<^1 * 9,)Ar-'\ 9i-^9 <9i,k^0,l,--- ,a-. 


(iii) I ■^2 giU \6,9)\<Mi9x, 9,), p + q = s, 

1 1 - ci( 9 ) \<p,\^~ 9 \<S, 





where 

p = Max. 1 Cl (^) — ci(^) I + N »? > 0, 0i < 0 < 62 

(iv) I Hi’-He) I < M(di ,82) for p = 0,1, ,8 - k,k == I, ,8, 

ei< 8 <e,-, 

(v) |G^“W|<M(0i,0a) for I = 0,1, ■■■ ,s-p- q, 

p + q < s - 1-, 

(vi) (pm0) is an admissible function of order m < s, 
then 

(6) Prli < } = « + r„i(@)7r“^ + • • • + r„.(0)ir' 

where 

I riffie) I < M{e, ,62) for p = 0, 1, • • • , 6 - j, j < a, 6i<8 <82. 

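Proof. Expand $g_1(t \mid \hat\theta, \theta)$ in a Taylor expansion about $t = c_1(\theta)$, $\hat\theta = \theta$, with remainder terms of order $s$ in $t - c_1(\theta)$, $\hat\theta - \theta$, and expand $c_i(\hat\theta)$ about $\hat\theta = \theta$ where the remainder term is of order $s + 1 - i$. Then for $|\hat\theta - \theta| \le \delta$, we have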

( 7 ) r''' \^s)dt = Pl0- ey, cyHe), ojl-h RN-'^, 

where P is a polynomial and \R \ < M(0i, 02)2 (5 — 8)' *N~'^^ioT [ S — 6 | < S. 

<■-0 

Integrating over $|\hat\theta - \theta| \le \delta$, we use conditions (ii), (iv) and (v) and the theorem follows. By a similar argument we have
Theorem 3. If 

(i) the conditions of Theorem 2 hold for each (61, 82) so that 

— CO < |3i < 01 < 02 < /3j < CO 

and 

(ii) gi{ci(d) \e,8)> { 1 /M{B,, 82)) >0, 8i< 8< 82, 

then the sequence 

<Pi0) = ci(^); 

(g) <P^0) = ci(^) - ri,i(^)i\r‘'*; 

is a sequence of admissible functions such that 
(a) Pr{< < vn.(^)} = « + R(6)N~”'>\ 


m < Sy 





where \ R{ 6 ) \ < , &%) for < Oi < 0 < 62 < . 

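These theorems permit us to obtain and to calculate critical regions whose size is asymptotically close to $\alpha$.

In Theorem 2, condition (ii) was much stronger than necessary. It may be relaxed if we define

$H_k(\theta) = N^{k/2} \displaystyle\int_{|\hat\theta - \theta| \le \delta} g_2(\hat\theta \mid \theta)(\hat\theta - \theta)^k\,d\hat\theta$,

where

$\Pr\{|\hat\theta - \theta| > \delta\} \le M(\theta_1, \theta_2)\, N^{-s/2}$, $\delta = \delta(N) = O(1)$.

However this may complicate the calculations.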

The symmetric case arises when the first moment almost vanishes, i.e. 

(10) I I < MiO,. P = 0, 1, ... . s - 1, B,<d<e2. 


In this case we have instead of the sequence given in Theorem 3, the sequence 

ip, 0 ) == Ci(^); 

( 11 ) ’ 






which is a sequence of admissible functions such that 

Pr{f < v«(^)} « a + r„ami9)N~'” + • • • + r„,.iB)N-’'^ 

I rl^liO) I < MiB, ,62) 6,<e<B2 p = 0 , 1 , • • • , s - n. 


4. An example. The following example previously treated by Welch from a different point of view will furnish an illustration of the applicability of the theorems to the case where $\theta$ is a $k$-dimensional parameter. It will also serve as an example of an extended type of symmetry. That is, it has the property that $|H_{2k+1}^{(p)}(\theta)| \le M(\theta_1, \theta_2)\, N^{-1/2}$, and hence, in the sequence (11), the $r_{m,2m+1}(\theta)$ terms effectively vanish thereby simplifying the calculations considerably.

We suppose that $t$ is a normally distributed variable with mean $\mu$ and variance $\sigma^2 = \lambda_1\sigma_1^2 + \cdots + \lambda_k\sigma_k^2$ where the $\lambda_i$ are known positive constants, the $\sigma_i^2$ are unknown parameters each of which is independently estimated by $s_i^2$ where $N_i s_i^2/\sigma_i^2$ has the $\chi^2$ distribution with $N_i$ degrees of freedom. It is desired to test the hypothesis that $\mu = 0$ so that the probability of rejecting the hypothesis if it is true should equal $\alpha$. Under the hypothesis the joint density distribution of $t, s_1^2, \cdots, s_k^2$ is given by






■where the momenta of s? - rj = - Qt are given by the coefficients of uVA;! 

in the expansion about u = 0 of — (2u(rf/iV’,))~^'^^: 

= 0 , 

lh{a]) = 2cri ; 

= (V + 2iV7V? • 

We define ci(5) by Pr(i < ci(fl)} = a where 

6 = (<ri, ffa, • • • , cr*) and h — (si, bL • ‘' I s*)j 


Ci{0) = Ci<r. 

Now ai{ 6 ) — a = Pr(c;(9) < t < Ci0)] may be computed within terms of order 
by expanding 

ci0) i=a ciff + Cl 22^ ^i(s( - oi) ~ Cl 12 o— <r]) 


8 (r» 




■whence 


ai{ 9 ) — a l=si dal ••• dslXL | <ri j fV^<) jy'^irp} ^ ~ 

- 22 - <rj)(s/ ~ ~ ^ 22 - <r?)(«/ - crj)^| 


Thus 


and 


where 


Ca(9)="-i^‘ExWiV7’ 


«a(fl) = Pr(/ < ciS + Z = « + 0(Z^), 


22 ■ 


Further approximations become somewhat complex and should be carried out 
in a systematic fashion. 
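The effect of the first two steps in this example can be sketched by simulation. The code below is an illustration under loudly stated assumptions: the second-step term is taken to be the familiar first-order Welch adjustment $c_1(\hat\theta)\,\frac{1 + z_\alpha^2}{4}\,\sum \lambda_i^2 s_i^4 N_i^{-1} / (\sum \lambda_i s_i^2)^2$, which is not transcribed from the displayed expression for $c_2$ in the text; the parameter values, seed and function name are hypothetical.

```python
import random
from statistics import NormalDist

def simulate_sizes(lams, sigma2s, dfs, alpha=0.05, reps=200_000, seed=1):
    """Monte Carlo sizes of t <= c1(theta_hat) and t <= c1(theta_hat) + c2(theta_hat)."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(alpha)  # lower alpha point of the unit normal
    sigma = sum(l * v for l, v in zip(lams, sigma2s)) ** 0.5
    plug_in = corrected = 0
    for _ in range(reps):
        t = rng.gauss(0.0, sigma)
        # s_i^2 = sigma_i^2 * chi2(N_i) / N_i; chi2(N) is Gamma(shape N/2, scale 2).
        s2 = [v * rng.gammavariate(n / 2, 2.0) / n for v, n in zip(sigma2s, dfs)]
        est_var = sum(l * s for l, s in zip(lams, s2))
        c1 = z * est_var ** 0.5
        ratio = sum((l * s) ** 2 / n for l, s, n in zip(lams, s2, dfs)) / est_var ** 2
        c2 = c1 * (1 + z * z) / 4 * ratio  # assumed first-order Welch adjustment
        plug_in += t <= c1
        corrected += t <= c1 + c2
    return plug_in / reps, corrected / reps

if __name__ == "__main__":
    print(simulate_sizes([1.0, 1.0], [1.0, 1.0], [5, 7]))
```

With small degrees of freedom the plug-in region is noticeably oversized, and the adjusted region should come out closer to $\alpha$, in line with the orders of magnitude stated above.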


5. Remarks. The range of application in practical statistical problems of the theorems of section 2 may be somewhat more limited than that of the original method proposed by Wald. Concerning the original method, the following theorems have been established.

Theorem 4 . If 

(i) Pr{t < Ci( 3 )l *=■ a, 0 < a < 1, where | cP’(9) | < M(di, Bf), By < B < 82 , 


(ii) 


(iii) 


p = 0, 1, 


I ^ + A I 0 4- A) 

' al'aA' ■ 


I < 00, 6 ) fori + j < s - 1 , Cl < t < Ci 


fii < 0, a + A < ft!, I A I < A', where 00, 0) depends on Ci ,Ci, Bi, 82 ,N ^ 

I 

and is integrdble in § over (— «>, «>); 




< L0, 8 ), i + j < s - 1, Cl < i < Ci , 61 < B < 82 , 


where £ L0, 0) M - 01* dd < M(0i, 0,, Ci, = 0, 1, 

(iv) 0 < A(Ci , 02 , 61 , 82 ) < A(t) < g,(t I 0) < Bit) < BiCi , , ft , ft) < «; 

01 < 0 < 02, Cl < « < Cj, 

I Bit) dl < Midi, 02); 

•/-oo 

90 ^ I 0) > 0, 

then a sequence c* 0), Ci0), c*(^), • • •, c*0), exists where Cm(0) is uniquely defined in 
01 , 02 ) HipPr{J — cf (d) — ■ • • —c^-i(^) < c».(0)) = «, and 

Id”’(0)1 < M(ft, 0,)fr^’"“'’'' p = 0,1, • • • , 5 - m + 1, ft < 0 < 02 
and c^(0) is any function so that ° 

|c:‘’’(0) - ci'’(0) I < Mir’"'* /or ft < 0 < 02 , p = 0, 1, • • • , s - m, 
and 

I 1 ^ M 01 , 02)ir""~'’'* - 00 < 0 < CO, p = 0, 1, ■ ■ • , s - wi + T 

Finally for Cm*(0) arbitrary within the above conditions, 

1 Pr{t - ci*(0)-- c*0) < 0) - a I ^ M(ft, 02)ir"* for ft < 0 < 02_ 

The conditions on the derivatives with respect to $\Delta$ are natural because the intuitive approach to the method seems to hinge on the assumption that $g(t, \hat\theta + \Delta \mid \theta + \Delta)$ changes gradually with respect to $\Delta$ "independent" of the value of $N$. This would not be true of $g(t, \hat\theta \mid \theta + \Delta)$ for large $N$.

The $c_m^*(\theta)$ were introduced in Theorem 4 because in practical examples it is usually found too difficult to compute $c_m(\theta)$ efficiently. On the other hand there are many alternative ways of obtaining functions with the properties of the $c_m^*(\theta)$. The $c_1(\theta)$, $c_2(\theta)$, etc. mentioned in Theorems 1, 2, 3 play the role of the $c_m^*(\theta)$ in Theorem 4 with the exception of the condition on $c_m^*(\theta)$ for $\theta$ outside $(\theta_1, \theta_2)$. The exception is due to the fact that the Theorems 1, 2, 3 correspond to the "infinite case." Theorem 4 is applicable to those cases where one is willing to assume that $\theta$ lies in $(\theta_1, \theta_2)$. It often happens that there is no such reason or that the conditions of the theorem hold only for every closed proper subinterval of $(\theta_1, \theta_2)$ but not for $\theta_1 \le \theta \le \theta_2$ itself. In these cases we may apply

Theoeem 5. If 

(i) all of the conditions of Theorem 4 apply lo every finite proper closed stibinierval 
( 01 , 02 ) of ( 01 , 02 ) where (0i, 0a) may he an infinite interval; 

(ii) Pr(| § - 0 1 > S(JV)) < Midi, for 0i < 0i < 0 < 0a < 02 , where 

5(_N) = 0(1) unless 0i or 0a is finite, in which case S(H) = o(l), then a 

sequence ct{ 6 ), Ci{d), ct 0 ), , e* 0 ), exists, where c„,(d) is uniquely defined in 

(01,02) hyPr\i - ci* 0 ) ~ C2*0) - - c^-i 0 ) < Cm( 0 )) == a, so that for every 

(01 , 02 ), 

1 c^'^>( 0 ) 1 < M{8i, 0 a)ir^"'~“'’t/ 0 i < 0 i < 0 < 02 < 02 , 

p = 0,1, • • • , 8 — m 4-1' 

and for c^{d) arhiirary within the above conditions 

i Pr|f < c?(^) + • • • + 4(5)} - a I < M(0i, 0a)2r’"'=' 

if 01 < 01 < 0 < 02 < 02, m s. 

Essentially this theorem can be proved by reference to the proof of Theorem 4 applied to the function

$g^*(t, \hat\theta \mid \theta) = g(t, \hat\theta \mid \theta)$ for $|\hat\theta - \theta| \le \delta$;

$g^*(t, \hat\theta \mid \theta) = 0$ for $|\hat\theta - \theta| > \delta$.

Some of the conditions in Theorems 4 and 5 are stronger than necessary. For example $g > 0$ may be replaced by a weaker condition where $g$ is positive in a region about $t = c_1(\theta)$. On the other hand the condition $\Pr\{|\hat\theta - \theta| > \delta\} \le M(\theta_1, \theta_2)\, N^{-s/2}$ in Theorem 5 is necessary to the argument used in the proof. It is easy to construct trivial examples where the results of this theorem apply although this condition is not satisfied. However an example has also been constructed where all the conditions of Theorem 5 hold except for this condition and the method of Wald fails to give the results.

These theorems are very easily extended to the $k$-dimensional parameter case by replacing the conditions on the derivatives with respect to $\Delta$ by the same order mixed derivatives with respect to $\Delta_1, \Delta_2, \cdots, \Delta_k$ of

$g(t, \hat\theta_1 + \Delta_1, \hat\theta_2 + \Delta_2, \cdots, \hat\theta_k + \Delta_k \mid \theta_1 + \Delta_1, \cdots, \theta_k + \Delta_k)$.

The symmetric case arises when the distribution of $\hat\theta$ is almost symmetric about $\theta$. More exactly we have



Theorem 6. If

(i) All the conditions of TheOteni 4 hold and Ij 0 , fl) has the additional •property that 

0 ~ OfL0, B) dS < Midi, 9i}N-\ 01 < 9 < 02, 

and 

^ i+i 

(ii) dS^di' ^ I“ 3^- (<. 20 - ^ I 0) < L(§, 9)\§ - 6\, 

Cl ^ t < Cl, Bi < 6 < 8i, i + i < s — 1 , 

then it is possible to construct a sequence c* 0), Ci0), • • • ,0^0), as in Theorem 4 
so that 

I ci^\e) I < Jl.f(0i , 

p = 0,1, ■ • ■ , s - 2 m + 2, 01 < fi < 0j; 
|c:<^’(0) -c<.’-'(0)|<M(0i.02W-”. 

p = 0, 1, • • • , s - 2 m + 1, 01 < 0 < 02 ; 

I I < M0i, 02)Ar^'-«, 

p = 0, 1, •••,s-2?n+2, - « <S< »; 
and 

I Pr« ^ cf (^) + • • • + cJC^)) - « I ^ Jtf(0i, 6i)N-‘'\ 

ei<9 <di,r= 

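Theorem 5 can also be extended to the symmetric case.

It is often possible in the theory of statistics to obtain an asymptotic expansion of the distribution of $t$, $\hat\theta$. The treatment of such cases is often very simple because of the prominent role played by the normal distribution in such asymptotic expansions. Suppose that

$g(t, \hat\theta \mid \theta) = \sqrt{N}\,\gamma(t, \psi \mid \theta)$,

where $\psi = \sqrt{N}(\hat\theta - \theta)$; $\gamma$ = density distribution of $(t, \psi)$,

$\gamma(t, \psi \mid \theta) = \gamma_0(t, \psi \mid \theta) + N^{-1/2}\gamma_1(t, \psi \mid \theta) + \cdots + N^{-(s-1)/2}\gamma_{s-1}(t, \psi \mid \theta) + \rho(t, \psi \mid \theta)\, N^{-s/2}$;

$\gamma_0, \gamma_1, \cdots, \gamma_{s-1}$ are independent of $N$;

$\iint |\rho|\,d\psi\,dt \le M(\theta_1, \theta_2)$, $\theta_1 \le \theta \le \theta_2$;

$\iint |\gamma_i|\,d\psi\,dt \le M(\theta_1, \theta_2)$, $\theta_1 \le \theta \le \theta_2$.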





CorrespondiDgly 'we have 

}((, j ID) = soft J I«)+r «I«) + • • ■ + 

I «) + KU I 


where 


},((,«I«) = Vff 7.(1, ^ 1«), fft ^ I") = f I«). 

/•nW) 

Then if we define cj(6) by I dfi I d% “ 

4 L 00 lUflO 

oM by f Sf 

J^qO JC 


l(0+"'+S)n-l(^) 


diffe 


.■c ■+«n_iw ,,, 

= « - ( dfl f 4- + ” ■ + 

*Lao 




or by 

CmW f ®l^) 

JLm 


-['A 




d%o + + ••■ 4- gm~iN' 




we obtain 

j Pr{f < Cl (H ■ • ‘ 4- c,0)} - « 1 I 

if $g$ obeys the conditions of Theorem 4 except that we need only $s - i + 1$ derivatives for $\gamma_i(t, \psi \mid \theta)$. The above definitions of $c_m(\theta)$ correspond to the $c_m^*(\theta)$ in Theorem 4. Analogues of Theorems 5 and 6 also apply to the asymptotic case.

REFERENCE 

[1] B. L. Welch, "On the studentization of several variances," Annals of Math. Stat., Vol. 18 (1947), p. 118.



SOME LOW MOMENTS OF ORDER STATISTICS 
By H. J. Godwin

University College of Swansea, Wales

1. Introduction. In a paper on order statistics from several populations [1], there were given, among other results, the means, variances, covariances, and correlations of order statistics in samples of ten or less from a normal population. These were obtained by numerical integration, and on account of the difficulties arising therefrom, some results were given to only two decimal places. More recently, Jones [3] has shown that some of the integrals, for sample sizes not greater than four, can be evaluated explicitly.

In this note these results are supplemented in two ways. For a paper which the author has recently submitted to Biometrika integrals were evaluated which can be used to give some of the results in [1] to more places of decimals. It is also shown that the table of explicit values can be extended.

2. Approximate values. Let the population studied be normal with mean zero and variance unity, and let the members of a sample of $n$ be $x(1 \mid n) \ge x(2 \mid n) \ge \cdots \ge x(n \mid n)$. The integrals available are

$\psi(i) = \displaystyle\int_{-\infty}^{\infty} F^i(x)(1 - F(x))^i\,dx$ $(1 \le i \le 5)$, and

$\psi(i, j) = \displaystyle\int_{-\infty}^{\infty} F^i(x) \int_x^{\infty} (1 - F(y))^j\,dy\,dx$ $(1 \le i + j \le 10)$,

where $F(x) = \int_{-\infty}^{x} f(t)\,dt$ and $f(x) = (2\pi)^{-1/2} e^{-x^2/2}$.

These were evaluated to ten places of decimals, the last place possibly being in error by one or two units.

For the purpose in hand we define also

$\alpha(i, j) = \displaystyle\int_{-\infty}^{\infty} x F^i(x)(1 - F(x))^j\,dx = -\alpha(j, i)$,

and

$\beta(i, j) = \displaystyle\int_{-\infty}^{\infty} x^2 f(x) F^i(x)(1 - F(x))^j\,dx = \beta(j, i)$.

Now, on integrating by parts, we have

$\displaystyle\int_x^{\infty} (1 - F(y))\,dy = -x(1 - F(x)) + \int_x^{\infty} y f(y)\,dy$,

and for $f(x)$ as defined above (so that in what follows we restrict ourselves to the normal distribution only), the second integral is $f(x)$. Hence $\psi(i, 1) + \alpha(i, 1) = 1/(i + 1)$ and we can construct a table of $\alpha$'s by using also the relation

$\alpha(i, j) - \alpha(i + 1, j) = \alpha(i, j + 1)$.

Again, on integrating by parts, we have 

- 2xa - dxH > 0 ) 

J—«o Z “7" i 

= - 1, t - 1) - ^(i, i) } - a(i + 1, i), 

t -T 1 I “T 1 

using the fact that, in this particular case, 2 ^’ — 1 is an odd function and 
F(1 — F) an even function of x. 

Hence /3(f, i) = 2 ( 2 ^+ i) ~ 1) ^ - 1) “ 2 i'^rY "*■ 

fiih 3 ) - + 1, j) = Kh J + 1) we can find the /3’s. 

FinaUy we put yii, j) = + °jih3) ~ fihj) ^ 

13 

an integration to be equal in this case to yij, i). 

Now

(1) $E(x(i \mid n) - x(i + 1 \mid n)) = {}^{n}C_i \displaystyle\int_{-\infty}^{\infty} F^{n-i}(x)(1 - F(x))^i\,dx$,

as was proved by Irwin [2]. By the symmetry here this integral is the same if $i$, $n - i$ are interchanged, and since $F^i(1 - F)^{n-i} + F^{n-i}(1 - F)^i$ is a polynomial in $F(1 - F)$ (as may be seen by putting $F = \tfrac{1}{2} + G$) the integrals (1) can be expressed in terms of the $\psi(i)$. Using the fact that the expected value of the median is zero the $E(x(i \mid n))$ follow.
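Relation (1) can be checked numerically. The sketch below is not the author's procedure: it integrates (1) directly by Simpson's rule instead of using tabulated $\psi(i)$, and it fixes the location by requiring the expected values to sum to zero, which for a symmetric population is equivalent to the median condition. Function names, the integration range and the step count are assumptions.

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_gap(n, i, half_width=10.0, steps=4000):
    """E[x(i|n) - x(i+1|n)] = nCi * integral of F^(n-i) (1-F)^i dx, by Simpson's rule."""
    h = 2.0 * half_width / steps

    def integrand(x):
        big_f = normal_cdf(x)
        return big_f ** (n - i) * (1.0 - big_f) ** i

    total = integrand(-half_width) + integrand(half_width)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(-half_width + k * h)
    return math.comb(n, i) * total * h / 3.0

def expected_values(n):
    """E x(i|n) for i = 1..n (x(1|n) largest), centred so that they sum to zero."""
    positions = [0.0]
    for i in range(1, n):
        positions.append(positions[-1] - expected_gap(n, i))
    shift = sum(positions) / n
    return [p - shift for p in positions]

if __name__ == "__main__":
    print([round(v, 7) for v in expected_values(5)])
```

For $n = 5$ this should reproduce the means of $x(1 \mid 5)$ and $x(2 \mid 5)$ given in Table 1 to the places shown.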

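The frequency function of $x = x(i \mid n)$ is

$\dfrac{n!}{(i - 1)!\,(n - i)!}\, F^{n-i}(x)(1 - F(x))^{i-1} f(x)$,

and so

(2) $E(x(i \mid n))^2 = \dfrac{n!}{(i - 1)!\,(n - i)!}\, \beta(n - i, i - 1)$.

The joint frequency function of $x_i = x(i \mid n)$ and $x_j = x(j \mid n)$ is

$\dfrac{n!}{(i - 1)!\,(j - i - 1)!\,(n - j)!}\,(1 - F(x_i))^{i-1}(F(x_i) - F(x_j))^{j-i-1} F^{n-j}(x_j)\, f(x_i) f(x_j)$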

(taking $j > i$), and to find $E(x_i x_j)$ we multiply by $x_i x_j$ and integrate, $x_j$ going from $-\infty$ to $\infty$ and $x_i$ from $x_j$ to $\infty$. On expanding by the multinomial theorem a typical term is

I- it (a;,) dx, dr.. 

TABLE 1

Means and standard deviations

Statistic    Mean         Standard      Statistic    Mean         Standard
                          Deviation                               Deviation
x(1|2)        .5641896    .8256453      x(1|8)       1.4236003    .6106530
x(1|3)        .8462844    .7479754      x(2|8)        .8522249    .4892862
x(2|3)       0            .6698292      x(3|8)        .4728225    .4480723
x(1|4)       1.0293754    .7012241      x(4|8)        .1525144    .4326503
x(2|4)        .2970114    .6003793      x(1|9)       1.4850132    .5977903
x(1|5)       1.1629645    .6689799      x(2|9)        .9322975    .4750755
x(2|5)        .4950190    .5581388      x(3|9)        .5719708    .4317205
x(3|5)       0            .5355685      x(4|9)        .2745259    .4129877
x(1|6)       1.2672064    .6449241      x(5|9)       0            .4075553
x(2|6)        .6417550    .5287511      x(1|10)      1.5387527    .5868083
x(3|6)        .2015468    .4961981      x(2|10)      1.0013571    .4631674
x(1|7)       1.3521784    .6260334      x(3|10)       .6560591    .4183339
x(2|7)        .7573743    .5066882      x(4|10)       .3757647    .3974153
x(3|7)        .3527070    .4687447      x(5|10)       .1226678    .3886565
x(4|7)       0            .4587449


We integrate by parts with respect to x, and then with respect to x, : the integral 
(3) is then seen to be 7(1 + r, n — j + s + 1), and 


(4) 


j—i—1 j—i—l-r 

^ (i - 1 ) 10 ' - r - l)!(n - j)! S S 


- i - 1)! 

rls!(j — t — 1 — r — s) 


y{i + r,n - j + s + 1 ). 


Using (1), (2) and (4), the values in Tables 1, 2, and 3 are obtained. The 
values are estimated to be correct, except for sample sizes 9 and 10, for which 
there may be errors of one or two units in the last place given. Missing values 
are filled in by considerations of symmetry. 
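Filling in the missing entries by symmetry can be sketched as follows. The helper name is hypothetical; it relies only on the fact that, for a symmetrical population, $x(n + 1 - i \mid n)$ has the distribution of $-x(i \mid n)$, so that the mean changes sign and the standard deviation is unchanged.

```python
def fill_by_symmetry(n, upper_half):
    """upper_half maps i -> (mean, sd) for the tabulated order statistics x(i|n);
    returns a dict covering every i = 1..n."""
    full = dict(upper_half)
    for i, (mean, sd) in upper_half.items():
        # The mirrored statistic x(n+1-i | n) has mean -mean and the same sd.
        full.setdefault(n + 1 - i, (-mean, sd))
    return full

if __name__ == "__main__":
    table_n5 = {1: (1.1629645, 0.6689799), 2: (0.4950190, 0.5581388), 3: (0.0, 0.5355685)}
    for i, entry in sorted(fill_by_symmetry(5, table_n5).items()):
        print(i, entry)
```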


3. Exact values. All the integrals occurring for $\psi(i)$ or $\psi(i, j)$ can, by suitable transformations, and the integration of one variable over the range $-\infty$ to $\infty$,

















TABLE 2

Variances and covariances

[Table entries not recoverable.]


Now if Q is ax\ the integral is Wr/a (this is, in effect, stated by Jones). 
By elementary integration we have also that if Q = ax* + 2hxy + by^, the 
integral is 


TABLE 4 


Exact expected values 


x(1|4):    √π [(2/5)a + (2/5)c]

x(2|4):    √π [(2/5)a − (6/5)c]

x(1|5):    √π [(1/3)a + c]

x(2|5):    √π [(2/3)a − 2c]

x(3|5):    0

x(li5)=: 

1 

+h 


+d 


x(2|5)“: 

1 



-4d 


x(3 5)=i: 

1 

-2b 


+6d 


x(l|5).x(2|5): 


b 


+d 


x(l 5)x(3 5): 

2 a 

-2b 


-2d 


x(l|5)x(4|5) 

— 2a 




+3/ 

.x(l|5).r(5|5) 





-2/ 

x(2|5)x(3l5) 

— 2a 

+3b 


-d 

+/ 

x(25)x(4j5) 

4a 

-4b 


+M 

-if 

x(ll6)«: 

1 

+b 


+3d 


x(2|6)2: 

1 

+b 


-9d 


x{3|6)=*; 

1 

-2b 


+6d 


x(16)x(2|6) 


h 


+3d 



3a 

-2b 

+3c 

— &d 

-3/ 

.T;(l|6),'i:(4j6) 

— 3a 


-9c 


+9/ 

.x(l|6)x(5|6) 



12 c 


-6/ 

,T(l|6).r(6j6) 



-6c 



x(2|6)x(36) 

— 3o 

+4b 

-3c 


+3/ 

,x(2|6)x(4|6) 

9a 

-6b 

+9c 

“)~6 q! 

-15/ 

x(2|6)x(56) 

— 6a 


-18c 


+ 18/ 

x(3j6).x(4j6) 

— 6a 

+6b 


-Cd 

+6/ 


and if $Q$ is $ax^2 + by^2 + cz^2 + 2fyz + 2gzx + 2hxy$, the integral is

where $\Delta = abc + 2fgh - af^2 - bg^2 - ch^2$.

The author has not succeeded in obtaining similar results with a higher number of variables; it is possible that elementary functions no longer suffice then.








Using these results we can obtain exact expressions for $\psi(1)$, $\psi(2)$ and $\psi(i, j)$ for $1 \le i, j$; $i + j \le 6$, which give, in addition to Jones' results, the exact expected values in Table 4, wherein

$a = 15/4\pi$

$b = 5\sqrt{3}/4\pi$

$c = (15/2\pi^2) \arcsin (1/3)$

$d = (5\sqrt{3}/2\pi^2) \arcsin (1/4)$

$f = (15/\pi^2) \arcsin (1/\sqrt{6})$
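As a check on the exact expressions, the constants can be evaluated and compared with the approximate values of Table 1. This is a sketch which assumes the readings $a = 15/(4\pi)$, $b = 5\sqrt{3}/(4\pi)$, $c = (15/2\pi^2)\arcsin(1/3)$, $d = (5\sqrt{3}/2\pi^2)\arcsin(1/4)$, $f = (15/\pi^2)\arcsin(1/\sqrt{6})$ of the definitions above, and the Table 4 entries for $x(1|4)$, $x(2|4)$, $x(1|5)$, $x(2|5)$ and $x(1|5)^2$; the variable names are hypothetical.

```python
import math

a = 15 / (4 * math.pi)
b = 5 * math.sqrt(3) / (4 * math.pi)
c = 15 / (2 * math.pi ** 2) * math.asin(1 / 3)
d = 5 * math.sqrt(3) / (2 * math.pi ** 2) * math.asin(1 / 4)
f = 15 / math.pi ** 2 * math.asin(1 / math.sqrt(6))

root_pi = math.sqrt(math.pi)

# Exact expected values of order statistics, from the first rows of Table 4.
exact_means = {
    "x(1|4)": root_pi * (2 / 5 * a + 2 / 5 * c),
    "x(2|4)": root_pi * (2 / 5 * a - 6 / 5 * c),
    "x(1|5)": root_pi * (1 / 3 * a + c),
    "x(2|5)": root_pi * (2 / 3 * a - 2 * c),
}

# E[x(1|5)^2] = 1 + b + d, hence the standard deviation of x(1|5).
sd_x15 = math.sqrt(1 + b + d - exact_means["x(1|5)"] ** 2)

if __name__ == "__main__":
    for name, value in exact_means.items():
        print(name, round(value, 7))
    print("sd x(1|5)", round(sd_x15, 7))
```

Agreement with Table 1 to seven places would support both the reading of the constants and the tabulated approximate values.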

REFERENCES 

[1] C. Hastings, Jr., F. Mosteller, J. W. Tukey and C. P. Winsor, "Low moments for small samples: a comparative study of order statistics," Annals of Math. Stat., Vol. 18 (1947).

[2] J. O. Irwin, "The further theory of Francis Galton's individual-difference problem," Biometrika, Vol. 17 (1926).

[3] H. L. Jones, "Exact lower moments of order statistics in small samples from a normal distribution," Annals of Math. Stat., Vol. 19 (1948).


Numerical values of the constants of Table 4: a = 1.19366 20732, b = .68916 11193, c = .25824 50843, d = .11085 93167, f = .63913 55493.



ON A THEOREM OF HSU AND ROBBINS 
By P. Erdős

Syracuse University

Let $f_1(x), f_2(x), \cdots$ be an infinite sequence of measurable functions defined on a measure space $X$ with measure $m$, $m(X) = 1$, all having the same distribution function $G(t) = m(x;\ f_k(x) \le t)$. In a recent paper Hsu and Robbins¹ prove the following theorem: Assume that

(1) $\displaystyle\int_{-\infty}^{\infty} t\,dG(t) = 0$,

(2) $\displaystyle\int_{-\infty}^{\infty} t^2\,dG(t) < \infty$.

Denote by $S_n$ the set $\left(x;\ \left|\sum_{k=1}^{n} f_k(x)\right| > n\right)$, and put $M_n = m(S_n)$. Then $\sum_{n=1}^{\infty} M_n$ converges.

It is clear that the same holds if $\left|\sum_{k=1}^{n} f_k(x)\right| > n$ is replaced by $\left|\sum_{k=1}^{n} f_k(x)\right| > c \cdot n$ (replace $f_k(x)$ by $c \cdot f_k(x)$).

It was conjectured that the conditions (1) and (2) are necessary for the convergence of $\sum_{n=1}^{\infty} M_n$. Dr. Chung pointed out to me that in this form the conjecture is inaccurate; to see this it suffices to put $f_k(x) = \tfrac{1}{2}(1 + r_k(x))$ where $r_k(x)$ is the $k$th Rademacher function. Clearly $|f_k(x)| \le 1$; thus $M_n = 0$, thus $\sum_{n=1}^{\infty} M_n$ converges, but $\int_{-\infty}^{\infty} t\,dG(t) \ne 0$. On the other hand we shall show in the present note that the conjecture of Hsu and Robbins is essentially correct. In fact we prove

Theorem I. The necessary and sufficient condition for the convergence of $\sum_{n=1}^{\infty} M_n$ is that

(1') $\left|\displaystyle\int_{-\infty}^{\infty} t\,dG(t)\right| < 1$,

and (2) should hold.
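Both Chung's counterexample and the convergence asserted by the theorem can be examined exactly for coin-tossing variables. The sketch below is an illustration only: it models the Rademacher functions as independent fair $\pm 1$ coin tosses, and the second part uses the variant with $c \cdot n$ and the arbitrary choice $c = 1/2$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def prob_abs_sum_exceeds(n, threshold, total_from_heads):
    """Exact P(|f_1 + ... + f_n| > threshold) when the sum depends only on the
    number of heads h in n fair coin tosses, h ~ Binomial(n, 1/2)."""
    favourable = sum(
        comb(n, h) for h in range(n + 1) if abs(total_from_heads(h, n)) > threshold
    )
    return Fraction(favourable, 2 ** n)

def chung_M(n):
    # f_k = (1 + r_k)/2 equals 1 on heads and 0 on tails, so the sum is h <= n.
    return prob_abs_sum_exceeds(n, n, lambda h, m: h)

def rademacher_M(n, c=Fraction(1, 2)):
    # f_k = r_k equals +1 on heads and -1 on tails, so the sum is 2h - n.
    return prob_abs_sum_exceeds(n, c * n, lambda h, m: 2 * h - m)

def partial_sum(upper):
    """Partial sum of M_n for the Rademacher case with threshold n/2."""
    return float(sum(rademacher_M(n) for n in range(1, upper + 1)))

if __name__ == "__main__":
    print(all(chung_M(n) == 0 for n in range(1, 40)))
    print(partial_sum(100), partial_sum(200))
```

In Chung's example every $M_n$ vanishes although the mean is $1/2$, and in the centred case the partial sums settle quickly, as the theorem leads one to expect.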

In proving the sufficiency of Theorem I, we can assume without loss of generality that (1) holds. It suffices to replace $f_k(x)$ by $(f_k(x) - C)$ where $C = \int_{-\infty}^{\infty} t\,dG(t)$. The following proof of the sufficiency of Theorem I (in other words essentially for the theorem of Hsu and Robbins) is simpler and quite different from theirs. Put

(3) $a_i = m(x;\ |f_k(x)| > 2^i)$;

since the $f_k$'s all have the same distribution, $a_i$ clearly does not depend on $k$.

¹ Proc. Nat. Acad. Sciences, 1947, pp. 25-31.

We evidently have

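$\displaystyle\sum_{i=0}^{\infty} 2^{2i}(a_i - a_{i+1}) \le \int_{-\infty}^{\infty} t^2\,dG(t) \le \sum_{i=0}^{\infty} 2^{2i+2}(a_i - a_{i+1}) \le \sum_{i=0}^{\infty} 2^{2i+2} a_i.$

Thus (2) is equivalent to

(4) $\displaystyle\sum_{i=0}^{\infty} 2^{2i} a_i < \infty.$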

Let 2’' < n < 2’+‘. Put 

g(l) ^ (j,. I ^ /„M . «.--2 

= (a:; 


= (XI 


fk(x) I > 2 ' for at least one k < n), 

fki(x) I > n'‘, I fki(x) \ > n'^, for at least two h < n, h < n), 

n I 


ZfUx) 


k^l 


> 2 -“), 


where the dash indicates that the k with | fk(x) | > are omitted. We 
evidently have 

c u sT U 

For if X is not in Sn^ U U Sn^, then clearly 


Z fk(x) 




< 2 ’-® + 2 '“’“ < n. 


00 

Thus to prove the convergence of Z it will suffice to show that 


n —1 


(5) Z (m(3n^) + m(Sn^) + m(»S5f’)) < »• 

From (3) we obtain thatm(jSn’) < n-a(_a < 2 *’*'^' 0 (_ 2 . Thus from (4) 

(6) Z = Z Z < Z 2^'-^’ a. < oo. 

<-0 !!<»<+> <-0 

From (4) we evidently have that for large u 

w(a;; |/*(a:) j > u) < 1/u. 

Thus since the /’s are independent and have the same distribution function it 
follows that for sufficiently large n, 

m()S^n^) < Z »»(*: |/*i(®) I > !/*»(*) I > 

l£ki<ki£n 

< m(x; l/i(a:)l > n'^),m(x-, 1/2(3;) | > n'^) < 

Hence 

(7) Z < w, ' 





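Put

$$f_k^{+}(x) = \begin{cases} f_k(x) & \text{for } |f_k(x)| \le n^{4/5}, \\ 0 & \text{otherwise.} \end{cases}$$

Clearly the $f_k^{+}(x)$ are independent and have the same distribution function
$G^{+}(t)$. Put

(8) $\int_{-\infty}^{\infty} t\,dG^{+}(t) = \epsilon, \qquad g_k(x) = f_k^{+}(x) - \epsilon.$

We have from (8) that $\int g_k(x)\,dm = 0$, and by (1) that $\epsilon \to 0$ as $n \to \infty$. We
evidently have

$$\int \Big(\sum_{k=1}^{n} g_k(x)\Big)^4 dm = \sum_{k=1}^{n} \int g_k^4(x)\,dm + 6 \sum_{1 \le k < l \le n} \int g_k^2(x)\,g_l^2(x)\,dm.$$

Now since $\max |g_k(x)| \le n^{4/5} + \epsilon$,

$$\int g_k^4(x)\,dm \le (n^{4/5} + \epsilon)^2 \int g_k^2(x)\,dm < c_1 n^{8/5},$$

$$\int g_k^2(x)\,g_l^2(x)\,dm = \int g_k^2(x)\,dm \int g_l^2(x)\,dm < c_1.$$

Thus

$$\int \Big(\sum_{k=1}^{n} g_k(x)\Big)^4 dm < c_2\, n^{13/5}.$$

Hence

(9) $m\Big(x;\ \Big|\sum_{k=1}^{n} g_k(x)\Big| > n/16\Big) < c_3\, n^{-7/5}.$

Thus from (8), (9), $|f_k^{+}(x)| \le |g_k(x)| + 1/16$ (for $\epsilon < 1/16$) and $n/8 < 2^{i-2}$
we have

$$m(S_n^{(3)}) = m\Big(x;\ \Big|\sum_{k=1}^{n} f_k^{+}(x)\Big| > 2^{i-2}\Big) \le m\Big(x;\ \Big|\sum_{k=1}^{n} g_k(x)\Big| > n/16\Big) < c_3\, n^{-7/5},$$

or

(10) $m(S_n^{(3)}) < c_3\, n^{-7/5}.$

Thus finally from (6), (7) and (10) we obtain (5) and this completes the proof
of the sufficiency of Theorem I.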





Next we prove the necessity of Theorem I, in other words we shall show that if
$\sum_{n=1}^{\infty} M_n$ converges then (1') and (2) hold.

First we prove (2). The following proof was suggested by Dr. Chung, who
simplified my original proof. By a simple rearrangement we see that (2) is
equivalent to

(11) $\sum_{n=1}^{\infty} n \int_{|t| > cn} dG(t) < \infty$

for any $c > 0$; while

(12) $\int_{-\infty}^{\infty} |t|\,dG(t) < \infty$

is equivalent to

(13) $\sum_{n=1}^{\infty} \int_{|t| > cn} dG(t) < \infty$

for any $c > 0$. Now we have clearly,

$$(x;\ |f_n(x)| > 2n) \subset S_{n-1} \cup S_n.$$

Hence

$$\sum_{n} \int_{|t| > 2n} dG(t) \le \sum_{n} \big(m(S_{n-1}) + m(S_n)\big) < \infty.$$

Thus we obtain (12). Since the terms of this series are non-increasing it follows
that

(14) $n \int_{|t| > 2n} dG(t) \to 0$ as $n \to \infty$.

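Our assumption being that $\sum M_n < \infty$ we have $M_n \to 0$ as $n \to \infty$. It follows
that there is a constant $p > 0$ independent of $k$ and $n$ such that

$$m\Big(x;\ \Big|\sum_{j \le n,\ j \ne k} f_j(x)\Big| < n\Big) \ge p.$$

Now, writing set intersections as products, we have

$$\bigcup_{k=1}^{n} (x;\ |f_k(x)| > 2n) \cdot \Big(x;\ \Big|\sum_{j \le n,\ j \ne k} f_j(x)\Big| < n\Big) \subset S_n.$$

Writing this for a moment as

$$\bigcup_{k=1}^{n} (R_k T_k) \subset S_n,$$

where $R_k = (x;\ |f_k(x)| > 2n)$ etc., and denoting by $R'$ the complement of $R$,
we have

$$M_n = m(S_n) \ge m\Big(\bigcup_{k=1}^{n} R_k T_k\Big)$$

$$= \sum_{k=1}^{n} m\big((R_1 T_1)' \cdots (R_{k-1} T_{k-1})' R_k T_k\big)$$

$$\ge \sum_{k=1}^{n} m(R_1' \cdots R_{k-1}' R_k T_k)$$

$$\ge \sum_{k=1}^{n} \big[m(R_k T_k) - m((R_1 \cup \cdots \cup R_{k-1}) R_k)\big]$$

$$\ge \sum_{k=1}^{n} \{m(T_k) - (k - 1)m(R_1)\}\, m(R_k)$$

$$\ge \sum_{k=1}^{n} \{p - n\,m(R_1)\}\, m(R_k) = \sum_{k=1}^{n} (p - o(1))\, m(R_k)$$

$$\ge p' \sum_{k=1}^{n} m(R_k) = n p' \int_{|t| > 2n} dG(t)$$

by (14) since $m(R_1) = \int_{|t| > 2n} dG(t)$, $n\,m(R_1) \to 0$ as $n \to \infty$.

Thus

$$\sum_{n} n \int_{|t| > 2n} dG(t) \le \frac{1}{p'} \sum_{n} M_n < \infty.$$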

Hence we have (11), which is equivalent to (2). The proof of (1') is quite easy.
By virtue of (2) we can put

$$\int_{-\infty}^{\infty} t\,dG(t) = C.$$

If $C > 1$, then it follows from (2) and Tchebycheff's inequality that $M_n \to 1$ as
$n \to \infty$, thus $C \le 1$. But if $C = 1$, we conclude from (2) and the central limit
theorem that $M_n$ does not tend to 0. Hence $C < 1$, and (1') is proved.

By similar methods we can prove the following results: Let 2 < c < 4. Put 



Then the necessary and sufficient condition for the convergence of $\sum_{n=1}^{\infty} M_n^{(c)}$
is that

$$\int_{-\infty}^{\infty} t\,dG(t) = 0, \qquad \int_{-\infty}^{\infty} |t|^c\,dG(t) < \infty.$$

If $c < 2$ then the necessary and sufficient condition for the convergence of
$\sum_{n=1}^{\infty} M_n^{(c)}$ is that $\int_{-\infty}^{\infty} |t|^c\,dG(t) < \infty$.

Finally we can prove the following result; Assume that / i dG{i) = 0 and 

J—Vi 

r" 

/ i* dG(i) < Then there exists a constant r so that 

J-oO 


(17) 5^ Wi fs; I £ fkix) I > ■ (log nY < «. 

n—l L 1 I 

The case of the Rademacher functions shows that (17) can not be improved 
very much, in fact only the value of r could be improved 



NOTES 

This section is devoted to brief research and expository articles on methodology and
other short items.

BROWNIAN MOTION ON THE SURFACE OF THE 3-SPHERE 

By Kôsaku Yosida

Mathematical Institute, Nagoya University 

1. Introduction. Let $S$ be an $n$-dimensional compact Riemann space with the
metric $ds^2 = g_{ij}(x)\,dx^i dx^j$ such that the totality $G$ of the isometric transformations
of $S$ onto $S$ constitutes a Lie group transitive on $S$. Consider a temporally
homogeneous Markoff process by which $P(t, x, y)$, $t > 0$, is the transition probability
that a point $x$ is transferred to $y$ after the elapse of $t$-unit time. We
assume that $P(t, x, y)$ is a Baire function in $(t, x, y)$ and continuous in $t$; then $P$
satisfies Smoluchowski's equation

(1.1) $P(t + s, x, y) = \int_S P(t, x, z)P(s, z, y)\,dz \quad (t, s > 0),$

$dz$ being the $G$-invariant measure $\sqrt{g(x)}\,dx^1 dx^2 \cdots dx^n$, $g(x) = \det(g_{ij}(x))$, and

(1.2) $P(t, x, y) \ge 0,$

(1.3) $\int_S P(t, x, y)\,dy = 1.$

The spatial homogeneity of the transition process may be defined by

(1.4) $P(t, Tx, Ty) = P(t, x, y)$ for $T \in G$.

The “continuity” of the transition process may be defined, following A.
Kolmogoroff and W. Feller,¹ as follows. Let $L_1(S)$ be the function space of
integrable (with respect to $dx$) functions $f(x)$ on $S$; then, for those $f(x)$ which are
dense in $L_1(S)$,

(1.5) $\dfrac{\partial f(t, x)}{\partial t} = A \cdot f(t, x), \quad (t \ge 0); \qquad f(t, x) = \int_S f(y)P(t, y, x)\,dy, \quad (t > 0), \quad f(0, x) = f(x),$

where, with non-negative b"ix) 

(1 6 ) (A/)(i) = ^ (- o'(x)/fa)) 

_ + ^ 

¹ A. Kolmogoroff, “Zur Theorie der stetigen zufälligen Prozesse,” Math. Annalen, Vol.
108 (1933); W. Feller, “Zur Theorie der stochastischen Prozesse,” Math. Annalen, Vol. 113
(1937).




The temporally and spatially homogeneous “continuous” Markoff process
may, if it exists, be called a Brownian motion on the homogeneous space $S$.
The purpose of the present note is to show that, under some derivability hypothesis
concerning $a^i(x)$ and $b^{ij}(x)$, there exists one and (essentially) only one Brownian
motion on the surface of the 3-sphere $S^2$.

I here express my hearty thanks to Dr. Kiyosi Itô who proposed to me the
problem and discussed and much improved the manuscript.


2. The defining equation for the Brownian motion. The spatial homogeneity
(1.4) is equivalent to the fact that $A$ is commutative with every operator $T$ defined
by

(2.1) $(Tf)(x) = f(Tx), \quad T \in G,$

because we have

$$\int_S f(y)P(t, y, Tx)\,dy = \int_S f(Ty)P(t, Ty, Tx)\,dTy = \int_S f(Ty)P(t, y, x)\,dy.$$

The condition (2.1) is equivalent to


(2.2) XA = AX for any infinitesimal operator X = —j 

induced on $S$ by the infinitesimal operator of the Lie group $G$. Thus, assuming
the derivability of $a^i(x)$ and $b^{ij}(x)$ of necessary orders, we obtain from (2.2) the
conditions:


(2.3) 


(2 4) 


(2.5) 


(g‘(x) - 

^ ^ + (."w - m (^ H‘w), 

(H'ix) = O'ix) + (V^ 5” {x)), 

b%) ^ - {'fa) 


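Now for the surface of the 3-sphere $S^2$,

$$ds^2 = d\theta^2 + \sin^2\theta \cdot d\varphi^2, \qquad g(\theta, \varphi) = \sin^2\theta,$$

and the infinitesimal operators

$$X_x = \sin\varphi\,\frac{\partial}{\partial\theta} + \frac{\cos\theta\cos\varphi}{\sin\theta}\,\frac{\partial}{\partial\varphi}, \qquad X_y = \cos\varphi\,\frac{\partial}{\partial\theta} - \frac{\cos\theta\sin\varphi}{\sin\theta}\,\frac{\partial}{\partial\varphi}, \qquad X_z = \frac{\partial}{\partial\varphi}$$

respectively correspond to the rotations about the x-, y- and z-axis.

From (2.5) we see that, by taking $X = X_z$,

(2.6) $b^{ij}(\theta, \varphi)$ is independent of $\varphi$.

By taking $X = X_z$ in (2.4) we see that $H^i$ is independent of $\varphi$. Hence, by (2.6),

(2.7) $a^i(\theta, \varphi)$ is independent of $\varphi$.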

Thus, by taking fc = 1, Z = X* we obtain from (2.4), 

cos ^ - b“(9) sin ^ = sin ^ | H\6)) 

and thus 

(2.8) H^(e) = 0, h^\8) + ^ (4-^ H\e)) = 0. 

da \ 8 in a / 

Hence, by taking k = 2, X = X^ or X = Xy , we obtain from (2.4) 

~H\d) cos <p , 2^11 cos d 003 p , 2 h^‘>‘(Q) cos & cos p 

sin* e sin* 6 Bind ^ sin ® 

H\8) sin (fi _ C03 6 sin <o j_ cos y cos (? sin ip 

sm* $ sin* 8 ' sin* 0 ^ sin 0 

From these two equations we obtain 

(2.9) b^\e) = 0, ^4) _ 2b”(e) ~ + b’‘\e) — = o, 

sin* 6 sm'e sin e 

By taking i = 2, /c = 1, X = X, , we obtain from (2.6), (2.9) 

b^\e) cos „ + b^\e) = 0 

a0\ Bin ^ / 

and hence 

( 2 . 10 ) b^(e) = . 

sin»e 

Similarly by taking i = 1, fc = 1, X = X, we obtain from ( 2 . 6 ) 

b^{d) cos <p + b'*(9) cos <p = sin (p ^4^ 

do 

and hence by ( 2 . 9 ), ( 2 . 10 ) 


(2.11) = constant <7, 

Thus we obtain from (2.4) 


6“(e) 


C 

sin* 0' 





ir^g) = -a'(fl) sin 0 + 2(7 eos 0, lf(d) = - sin 0-a (0) 
and thus, by (2.8), 

( 2 . 12 ) a\d) = 0 . 

Substituting (2.11) in (2.9) we obtain 


(2.13) a‘(0) = . 

sin 0 

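Therefore since $b^{11}(\theta)$ and $b^{22}(\theta)$ are non-negative, $A$ is (essentially) equal to
the Laplace operator

(2.14) $\dfrac{1}{\sin\theta}\dfrac{\partial}{\partial\theta}\Big(\sin\theta\,\dfrac{\partial}{\partial\theta}\Big) + \dfrac{1}{\sin^2\theta}\dfrac{\partial^2}{\partial\varphi^2}.$

Thus we may obtain $P(t, x, y)$ by integrating the equation

(2.15) $\dfrac{\partial f(t; \theta, \varphi)}{\partial t} = A \cdot f(t; \theta, \varphi), \quad (t \ge 0),$

and by putting

(2.16) $f(t; \theta, \varphi) = f(t, x) = \int_{S^2} f(y)P(t, y, x)\,dy.$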


3. Integration of the equation (2.15)-(2.16). Consider the Laplacian (real)
spherical harmonics

(3.1) $Y_k^{(m)}(\theta, \varphi) = Y_k^{(m)}(x), \quad (-k \le m \le k;\ k = 0, 1, \cdots).$

They constitute an orthonormal function system complete for continuous
functions on $S^2$, and we have

(3.2) $A \cdot Y_k^{(m)}(\theta, \varphi) = -k(k + 1)\,Y_k^{(m)}(\theta, \varphi).$

Since, as is well-known,

(3.3) $Y_k^{(m)}(T^{-1}x) = \sum_{n=-k}^{k} u_{nm}^{(k)}(T)\,Y_k^{(n)}(x)$

by an irreducible orthogonal representation $(u_{nm}^{(k)}(T))$ of the rotation group $G$,
we have

(3.4) $\max_x |Y_k^{(m)}(x)|^2 \le (2k + 1) \min_x \sum_{n} |Y_k^{(n)}(x)|^2,$

by applying the Schwarz inequality and the transitivity of the group $G$ on
$S^2$. The right hand member satisfies, by the orthonormality,

(3.5) $\le (2k + 1)^2/(\text{area of } S^2).$

Therefore the double series (for $t > 0$)

(3.6) $P(t; \theta, \varphi; \theta', \varphi') = \sum_{k=0}^{\infty} \sum_{m=-k}^{k} \exp(-k(k + 1)t)\,Y_k^{(m)}(\theta, \varphi)\,Y_k^{(m)}(\theta', \varphi')$

is absolutely and uniformly convergent on $S^2$. We will show that this $P$ is
the required (unique) Brownian motion on $S^2$.
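The series (3.6) can be evaluated numerically. The sketch below is an illustration under stated assumptions, not the author's computation: it takes the unit sphere (area $4\pi$), the operator $A$ equal to the Laplace operator (2.14) with no extra constant factor, and it uses the classical addition theorem $\sum_m Y_k^{(m)}(x)Y_k^{(m)}(y) = \frac{2k+1}{4\pi}P_k(\cos\gamma)$, where $\gamma$ is the angle between $x$ and $y$ and $P_k$ is the Legendre polynomial. The function names are hypothetical.

```python
import math

def legendre_values(mu, kmax):
    """Return [P_0(mu), ..., P_kmax(mu)] by the three-term recurrence."""
    vals = [1.0, mu]
    for k in range(1, kmax):
        vals.append(((2 * k + 1) * mu * vals[k] - k * vals[k - 1]) / (k + 1))
    return vals[: kmax + 1]

def heat_kernel(t, mu, kmax=40):
    """Series (3.6) truncated at kmax, as a function of mu = cos(gamma)."""
    p = legendre_values(mu, kmax)
    return sum((2 * k + 1) / (4 * math.pi) * math.exp(-k * (k + 1) * t) * p[k]
               for k in range(kmax + 1))

def total_mass(t, n=4000):
    """Midpoint rule for the integral of the kernel over the whole sphere."""
    h = 2.0 / n
    return 2 * math.pi * h * sum(heat_kernel(t, -1.0 + (j + 0.5) * h)
                                 for j in range(n))
```

With these assumptions `total_mass(t)` should be close to 1, which is condition (1.3), and the kernel should stay non-negative, which is condition (1.2); only the $k = 0$ term contributes to the total mass.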

The proof may be given in three steps. i) We see by (3.2) and (3.6), that

$f(t, x) = \int_{S^2} f(y)P(t, y, x)\,dy$ satisfies (2.15) if

$$f(x) = \sum_{k=0}^{\infty} \sum_{m=-k}^{k} d_k^{(m)}\,Y_k^{(m)}(x), \qquad \sum_{k=0}^{\infty} \sum_{m=-k}^{k} \exp(-k(k + 1)t)\,k(k + 1)\,d_k^{(m)}\,Y_k^{(m)}(x)$$

are both absolutely and uniformly convergent. By the completeness of $\{Y_k^{(m)}(x)\}$,
such $f(x)$ are dense in $L_1(S)$.

ii) Because of (3.3) we see that (3.6) satisfies the spatial homogeneity (1.4).

iii) (1.3) is obvious by the orthonormality of $\{Y_k^{(m)}(x)\}$ and the constancy
on $S^2$ of $Y_0^{(0)}(x)$. Next, for the solution $f(t, x)$ of (2.15)-(2.16), let $f(x) = f(0, x)$
be non-negative on $S^2$; then $g_\epsilon(t, x) = \exp(-\epsilon t) f(t, x)$, $(\epsilon > 0)$, satisfies

$$\frac{\partial g_\epsilon(t, x)}{\partial t} = A \cdot g_\epsilon(t, x) - \epsilon\, g_\epsilon(t, x), \quad (t > 0),$$

$$g_\epsilon(0, x) = f(x) \ge 0 \quad (\text{on } S^2).$$

Thus $g_\epsilon(t, x) \ge 0$ on $S^2$, since $g_\epsilon(t, x)$ cannot have a negative minimum on the
product space $[t_1, t_2] \times S^2$, for any $t_2 > t_1 > 0$. For at such minimizing point
we must have







S 0. 


Therefore, since $\epsilon > 0$, $t_2 > t_1 > 0$ were arbitrary, we conclude that $f(t, x) \ge 0$
on $S^2$ for $t > 0$ if $f(x) \ge 0$ on $S^2$. This proves (1.2). The same argument
simultaneously shows us that the solution $P$ of (2.15)-(2.16) and (1.2)-(1.3) is
unique.


ON THE STRONG STABILITY OF A SEQUENCE OF EVENTS 
By Aryeh Dvoretzky

Hebrew University, Jerusalem, and Institute for Advanced Study 

1. Summary. M. Loève [3] has found conditions under which a sequence of
events which may be interdependent in an arbitrary manner is strongly stable.
In this note it is established that considerably weaker conditions imply the
strong stability.


2. Introduction. Let

(1) $A_1, A_2, \cdots, A_n, \cdots$

be a sequence of events, which may depend on each other in any way whatsoever,
defined on the same set of trials.

Let $R_n$ be the repetition function of (1), i.e. $R_n$ is the number of those among the
first $n$ events: $A_1, A_2, \cdots, A_n$, which were realized, and put $f_n = R_n/n$. The
random variable $f_n$ is called the frequency function of (1).

Denoting by $\bar{x} = E\{x\}$ the expected value of $x$ it is evident that

$$\bar{R}_n = \sum_{\nu=1}^{n} \Pr(A_\nu), \qquad \bar{f}_n = E\{f_n\} = \bar{R}_n/n.$$

Following Loève [3, p. 252] we say that (1) is strongly stable if the sequence
$\varphi_n = f_n - \bar{f}_n$ $(n = 1, 2, \cdots)$ is strongly stable in the usual Kolmogoroff sense
[1, p. 58], i.e. if

(2) $\lim_{n \to \infty} \Pr\big(\sup_{\nu \ge n} |\varphi_\nu| > \epsilon\big) = 0$

for every $\epsilon > 0$.
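Putting¹

$$p_n = \frac{1}{n} \sum_{\nu=1}^{n} \Pr(A_\nu), \qquad \gamma_n = \frac{2}{n(n - 1)} \sum_{1 \le \mu < \nu \le n} \Pr(A_\mu A_\nu)$$

and introducing the abbreviation²

$$\delta_n = \gamma_n - p_n^2,$$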


Loève's result [3, pp. 257-9] is the following:

If $n\delta_n$ is bounded then (1) is strongly stable.

This, even when specialized to sequences of independent events, includes the
Bernoulli and Poisson cases.

Here the following stronger result will be established.

Theorem. If $\sum \delta_n/n$ is convergent then (1) is strongly stable.

In particular, if for some $\epsilon > 0$ the sequence $n^{\epsilon}\delta_n$ is bounded then (1) is strongly
stable.


3. A lemma. The new tool here used is the following simple result on series of
positive terms.

Lemma. Let $a_n > 0$ for $n = 1, 2, \cdots$, and

(3) $\sum_{n=1}^{\infty} \dfrac{a_n}{n}$

be convergent. Then there exists a sequence $n_i$ of integers satisfying

(4) $0 < n_{i+1} - n_i = o(n_i) \quad (i \to \infty)$

and such that the series $\sum a_{n_i}$ is convergent.


¹ $A_\mu A_\nu$ denotes the event: both $A_\mu$ and $A_\nu$.

² Our $p_n$, $\gamma_n$ and $\delta_n$ correspond to Loève's $p_1(n)$, $p_2(n)$ and $d_n$ respectively.





Proof. Since (3) is convergent it is well known³ that there exists a sequence
of numbers $l_n$ $(n = 1, 2, \cdots)$ satisfying

(5) $l_{n+1} \ge l_n, \qquad \lim_{n \to \infty} l_n = \infty,$

having the property that

(6) $\sum_{n=1}^{\infty} \dfrac{l_n a_n}{n} < \infty.$

We define inductively a sequence of integers $m(i)$ through

(7) $m(1) = 1, \qquad m(i + 1) = m(i) + 1 + \Big[\dfrac{m(i)}{l_{m(i)}}\Big],$

the square brackets denoting the integral part. Clearly

(8) $0 < m(i + 1) - m(i) = o(m(i)).$

Now for every $i$ we choose $n_i$ so that

$$m(i) \le n_i < m(i + 1) \quad \text{and} \quad a_{n_i} = \min_{m(i) \le n < m(i+1)} a_n.$$

These $n_i$ satisfy the requirements of the lemma.

Indeed, (4) holds in virtue of (8) while applying (5) and (7) we obtain

$$s_i = \sum_{n=m(i)}^{m(i+1)-1} \frac{l_n a_n}{n} \ge \frac{(m(i + 1) - m(i))\,l_{m(i)}}{m(i + 1)}\,a_{n_i} \ge \frac{m(i)}{m(i + 1)}\,a_{n_i}.$$

Since $\sum s_i$ converges by (6) it follows from the preceding inequality and (8) that
$\sum a_{n_i} < \infty$ as required.
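The block construction in the proof can be sketched in a few lines. This is a minimal illustration, assuming the hypothetical choices $a_n = n^{-1/2}$ and $l_n = n^{1/4}$ (neither appears in the paper); they satisfy (3), (5) and (6) because $\sum n^{-3/2}$ and $\sum n^{-5/4}$ converge.

```python
import math

def a(n):
    """Assumed terms a_n = n**(-1/2); the series of a_n / n then converges."""
    return n ** -0.5

def l(n):
    """Assumed weights l_n = n**(1/4): non-decreasing, unbounded, and the
    series of l_n * a_n / n converges."""
    return n ** 0.25

def subsequence(steps):
    """Return n_1, ..., n_steps, one index from each block m(i) <= n < m(i+1)."""
    m, chosen = 1, []
    for _ in range(steps):
        m_next = m + 1 + math.floor(m / l(m))   # recursion (7)
        # pick the index of the smallest a_n inside the current block
        chosen.append(min(range(m, m_next), key=a))
        m = m_next
    return chosen

n_seq = subsequence(60)
```

The indices come out strictly increasing, and the relative gaps $(n_{i+1} - n_i)/n_i$ shrink as $i$ grows, in line with (4).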

Corollary. The conclusion of the lemma remains valid if the condition $a_n > 0$
is dropped provided (3) is absolutely convergent.


4. Proof of the theorem. An easy calculation [3, p. 253] gives

$$\sigma_n^2 = E[(f_n - \bar{f}_n)^2] = \delta_n + \frac{p_n - \gamma_n}{n}.$$

Since both $p_n$ and $\gamma_n$ are between zero and one we have

$$-\frac{1}{n} \le \sigma_n^2 - \delta_n \le \frac{1}{n}.$$

Therefore it follows from the assumption of the theorem that $\sum (\sigma_n^2/n)$ is convergent.
Hence by the lemma there exists a sequence of integers $n_i$ satisfying
(4) and such that $\sum \sigma_{n_i}^2$ converges.
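The variance identity used at the start of this proof can be verified exactly on a small example. The sketch below assumes a hypothetical joint law for three dependent events (the probabilities are invented for illustration) and uses $p_n$ as the mean of the $\Pr(A_\nu)$, $\gamma_n$ as the mean of the $\Pr(A_\mu A_\nu)$ over pairs $\mu < \nu$, and $\delta_n = \gamma_n - p_n^2$.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical joint law of n = 3 dependent events: each key is the vector of
# indicators (1 = event realized), each value is its probability.
law = {
    (1, 1, 1): Fraction(3, 10),
    (1, 1, 0): Fraction(1, 10),
    (1, 0, 0): Fraction(2, 10),
    (0, 0, 1): Fraction(1, 10),
    (0, 0, 0): Fraction(3, 10),
}
n = 3

def prob(indices):
    """Probability that all events with the given indices are realized."""
    return sum(p for w, p in law.items() if all(w[i] for i in indices))

p_n = sum(prob([i]) for i in range(n)) / n
gamma_n = Fraction(2, n * (n - 1)) * sum(prob(pair)
                                         for pair in combinations(range(n), 2))
delta_n = gamma_n - p_n ** 2

# Mean and variance of the frequency f_n = R_n / n computed directly.
mean_f = sum(p * Fraction(sum(w), n) for w, p in law.items())
var_f = sum(p * (Fraction(sum(w), n) - mean_f) ** 2 for w, p in law.items())
```

Exact rational arithmetic is used so that the identity can be compared with `==` rather than with a tolerance.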


» Take e.g. = (2,>„ (cf [ 2 , p. 299]). 





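Applying Tchebycheff's inequality to $\varphi_{n_\nu}$ and adding for $\nu \ge i$
we have for every $\epsilon > 0$

(9) $\Pr\big(\sup_{\nu \ge i} |\varphi_{n_\nu}| > \epsilon\big) \le \dfrac{1}{\epsilon^2} \sum_{\nu \ge i} \sigma_{n_\nu}^2.$

If $n_i < n < n_{i+1}$ then

$$|f_n - f_{n_i}| \le \frac{n - n_i}{n} \le \frac{n_{i+1} - n_i}{n_i}.$$

Denoting the last term of this inequality by $\epsilon_i$ and putting $\epsilon_i' = \max_{\nu \ge i} \epsilon_\nu$, we
have from (9)

$$\Pr\big(\sup_{n \ge n_i} |\varphi_n| > \epsilon + 2\epsilon_i'\big) \le \frac{1}{\epsilon^2} \sum_{\nu \ge i} \sigma_{n_\nu}^2.$$

As $\epsilon_i' \to 0$ and the right hand term is the remainder of a convergent series, (2)
follows and the theorem is proved.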

5. Remarks. 1. The lemma used here can also be applied to the study of
the order of magnitude of $\varphi_n$ in the almost certain sense.

2. If the terms of (3) are decreasing then the existence of a convergent subseries
of $\sum a_n$ satisfying (4) implies $\sum_{i=1}^{\infty} a_{2^i} < \infty$. But this is equivalent to the
convergence of the series with monotone terms (3) (cf. e.g. [2, p. 130]). Hence
in this case the convergence of (3) is necessary as well as sufficient for the validity
of the lemma. It may be possible to use this remark in order to establish in
some special cases, where the interdependence of the variables decreases steadily
in a suitable sense, necessary and sufficient conditions for strong stability.

3. The sequence of $\delta_n$ is, of course, of very specialized structure. Thus, since
the stability of (1) is equivalent [3, p. 255] to $\delta_n \to 0$ and is implied by strong
stability, it follows that $\delta_n \to 0$ whenever $\sum (\delta_n/n)$ is convergent.

Added in proof: Since this paper was submitted I heard from Professor M.
Loève that he has independently obtained the theorem of section 2 by another
method.

REFERENCES 

[1] A. Kolmogoroff, Grundbegriffe der Wahrscheinlichkeitsrechnung, Ergeb. d. Math., Vol. 2,
no. 3, Springer, Berlin, 1933.

[2] K. Knopp, Theory and Applications of Infinite Series, Blackie, London and Glasgow,
1928.

[3] M. Loève, “Étude asymptotique des sommes de variables aléatoires liées,” Jour. de
Math. pures et appl., Vol. 24 (1946), pp. 249-318.







A NOTE ON WEIGHING DESIGN 
By K. S. Banerjee
Pusa, Bihar, India 

1. Efficiency of weighing designs given by a three-fourth replicate. In the
June issue of the Annals, Kempthorne [1] approached the construction of the
orthogonal matrix X through fractional replicates, the original treatment of
which was given by Finney [2]. Reference has been made to the use of a three
fourth replicate for weighing designs. Details for such designs have not been
furnished as their efficiency is lower than for the designs given by the completely
orthogonal matrix X. In a three fourth replicate the treatment combinations
have to be chosen in a particular manner for a comparatively easier
analytical treatment both from the point of view of agrobiological experiments
as well as weighing designs. The variance of each of the estimates in such a case
will be $\sigma^2/2^{n-1}$. As a matter of fact, in a weighing design given by a fractional
replicate of the type $(2^\beta - 1)/2^\beta$ $(\beta = 1, 2, \cdots n)$, of $2^n$ experiments, the
estimate of the variance of each object is independent of the fraction used and
is equal to the same as above.

2. Construction of a three fourth replicate. Kempthorne mentions that a
factorial design of fraction ¾ could be taken to consist of a ½ replicate on the
identity I = ABC and a quarter replicate based on the identity

I = A = BC = ABC.

If the half replicate based on the identity I = ABC be taken to consist of all the
treatments corresponding to the minus signs of the treatment contrast ABC [3],
the additional quarter replicate can be chosen in two different ways. When
however the treatments corresponding to the minus signs of both A and BC
are kept, omitting the treatments corresponding to the plus signs of A and BC,
the three fourth replicate so obtained will have certain advantages, which will
not be available if the quarter replicate to be added is chosen to consist of the
treatments corresponding to the plus signs of A and BC.

3. Behavior of the contrasts in a three fourth replicate and the efficiency of
the weighing designs. In general, if there are n treatments giving rise to $2^n$
treatment combinations and if the defining contrasts be chosen as

I = ACD = BDE = ABCE,

it will be necessary to omit the treatment combinations corresponding to the
plus signs of both ACD and BDE, which will be $2^{n-2}$ in number. In the three
fourth replicate so obtained, $2^n$ treatment effects (inclusive of the mean) will
divide themselves into sets of 4 treatment contrasts each. One of the sets will
be I, ACD, BDE and ABCE and any other set will be formed by multiplying
any treatment contrast by the defining set, namely I, ACD, BDE and ABCE.
Only three contrasts out of four in a set will be independent, so that only one of
the contrasts, preferably the one of the highest order interaction, may be kept
as an alias (in agrobiological experiments) of the remaining three and may
therefore be omitted. Each of the four contrasts within a set will be orthogonal
to each of the other contrasts in the remaining sets, but within a set the four
contrasts will be non-orthogonal to one another. Though non-orthogonal, the
normal equations will be of the systematic type¹ and the matrix X'X, taking
any three contrasts out of each set of four, will take the following form:


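(1) $\begin{bmatrix}
x & a & a & 0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
a & x & a & 0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
a & a & x & 0 & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & x & a & a & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & a & x & a & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & a & a & x & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & 0 & x & a & a & \cdots \\
0 & 0 & 0 & 0 & 0 & 0 & a & x & a & \cdots \\
0 & 0 & 0 & 0 & 0 & 0 & a & a & x & \cdots \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdots
\end{bmatrix}$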


where the order of the matrix $N = \tfrac{3}{4} \cdot 2^n$ is of the form $3t$ $(t = 2^{n-2})$ and $x = \tfrac{3}{4} \cdot 2^n$,
$a = -\tfrac{1}{2} \cdot 2^n + \tfrac{1}{2} \cdot 2^{n-1} = -2^{n-2}$. The value of the above determinant $= (x - a)^{2t}(x + 2a)^{t}$
and that of the determinant suppressing the first row and the first
column $= (x - a)^{2t-1}(x + a)(x + 2a)^{t-1}$. Their ratio is $(x + a)/\{(x - a)(x + 2a)\} = 1/2^{n-1}$,
substituting for $x$ and $a$. The variance of each estimate will therefore
be $\sigma^2/2^{n-1}$.
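The values of $x$ and $a$ can be confirmed by building a three fourth replicate directly. The following sketch is only an illustration: it takes $n = 5$ factors with the defining contrasts ACD and BDE of this section, uses a −1/+1 coding of factor levels (an assumed convention), and forms X'X for the three contrasts I, ACD and BDE of one set.

```python
from fractions import Fraction
from itertools import product

n = 5                              # factors A, B, C, D, E coded as -1 / +1
ACD, BDE = (0, 2, 3), (1, 3, 4)    # positions of the factors in each contrast

def contrast(run, factors):
    """Sign of an interaction contrast for one treatment combination."""
    sign = 1
    for f in factors:
        sign *= run[f]
    return sign

# Drop the runs on the plus side of both defining contrasts.
runs = [r for r in product((-1, 1), repeat=n)
        if not (contrast(r, ACD) == 1 and contrast(r, BDE) == 1)]

# Columns I, ACD, BDE: three contrasts out of the set {I, ACD, BDE, ABCE}.
X = [[1, contrast(r, ACD), contrast(r, BDE)] for r in runs]
XtX = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]

x, a = XtX[0][0], XtX[0][1]
variance_factor = Fraction(x + a, (x - a) * (x + 2 * a))
```

Here 24 of the 32 runs remain, the diagonal of X'X equals $3 \cdot 2^{n-2}$, the off-diagonal elements equal $-2^{n-2}$, and the variance factor reduces to $1/2^{n-1}$.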

4. General case. When a fraction of the type $\alpha/2^\beta = (2^\beta - 1)/2^\beta$ is used,
the treatment combinations corresponding to the plus signs of the $\beta$ independent
contrasts are omitted. Out of each set of $2^\beta$ treatment contrasts, only $\alpha = 2^\beta - 1$
will be independent and the matrix will then take a form like that of (1), where

$x = [(2^\beta - 1)2^n]/2^\beta = 2^{n-\beta}(2^\beta - 1)$ and
$a = -\tfrac{1}{2} \cdot 2^n + [(2^{\beta-1} - 1)2^n]/2^\beta = -2^{n-\beta},$

and the variance factor is $[x + (\alpha - 2)a]/\{(x - a)[x + (\alpha - 1)a]\} = (2 \cdot 2^{n-\beta})/(2^n\, 2^{n-\beta}) = 1/2^{n-1}.$

The variance of each estimate $= \sigma^2/2^{n-1}$, the same as before. When a completely
orthogonalised matrix of the order $(\alpha 2^n)/2^\beta = 2^{n-\beta}(2^\beta - 1)$ is available,
the variance of an estimate will be $\sigma^2/\{2^{n-\beta}(2^\beta - 1)\}$. The ratio of the two
variances $= 2^{n-1}/(2^n - 2^{n-\beta}) = 2^{\beta-1}/(2^\beta - 1)$, which shows how the efficiency
of the weighing design decreases with the increasing value of the fraction.
When $\beta = 1$, i.e. in a half replicate, the efficiency is 100 percent. The value of
the fraction is never less than ½.
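The general formulas of this section are easy to tabulate. The sketch below is a hedged illustration with hypothetical function names; it evaluates $x$, $a$, the variance factor and the efficiency ratio in exact rational arithmetic for a chosen $n$ and $\beta$.

```python
from fractions import Fraction

def design_constants(n, beta):
    """x, a and alpha for the (2**beta - 1)/2**beta replicate of a 2**n design."""
    alpha = 2 ** beta - 1
    x = 2 ** (n - beta) * alpha
    a = -(2 ** (n - beta))
    return x, a, alpha

def variance_factor(n, beta):
    """Diagonal element of the inverse of an alpha x alpha block that has x on
    the diagonal and a everywhere else."""
    x, a, alpha = design_constants(n, beta)
    return Fraction(x + (alpha - 2) * a, (x - a) * (x + (alpha - 1) * a))

def efficiency(n, beta):
    """Variance under a fully orthogonal matrix of the same order, divided by
    the variance under the fractional replicate."""
    x, _, _ = design_constants(n, beta)
    return Fraction(1, x) / variance_factor(n, beta)
```

For example, `efficiency(6, 2)` gives 2/3 for the three fourth replicate and `efficiency(6, 3)` gives 4/7 for the seven eighth replicate, while the variance factor stays at $1/2^{n-1}$ throughout.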

¹ The analysis of the data available from agrobiological experiments will not be cumbersome
to a prohibitive extent as in many other experiments where non-orthogonality creeps
in. The results of investigation in this direction have already been communicated for
publication elsewhere.









5. Independence of the estimates given by $L_N$ in a biased spring balance.
Kempthorne mentions that although the optimum designs for the spring balance
case suggested by Mood furnish somewhat smaller variance than what is given
by fractional replicates, these designs have the disadvantage that the estimates
are correlated, whereas the estimates furnished by fractional replicates are
orthogonal. The designs furnished by fractional replicates take account of the
bias and if the weighing operation corresponding to the bias is omitted (in case
where the spring balance is free from bias), the resultant scheme will fail to give
independent estimates and the variance factors will be of the same magnitude
as in the optimum design $L_N$ of Mood with the same number of weighings.
Again, these optimum designs may also be made to furnish independent estimates
when the designs are adjusted in the manner as suggested by Mood to suit a
biased spring balance.

It is true that the design matrix $L_3$ given by

$$X = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}$$

does not give independent estimates as such; but when it is assumed that the
spring balance has a bias and the design matrix is modified as follows:

(2) $X = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 \end{bmatrix},$

the estimates except that for the bias will be orthogonal to one another and the
variance of the estimated weights will necessarily be larger in value.

Before proving the general case, we notice that when −1 is substituted for 0
in (2) above, the resultant scheme will be an orthogonalised matrix. This is
true not only in this particular instance but will hold good also in general. The
constitution will be clear when the method of construction of $L_N$ from $H_{N+1}$
is recalled.

The distribution of ones in $L_N$ gives a special type of symmetrical balanced
incomplete block design, where $r = k = \tfrac{1}{2}(b + 1)$ and $\lambda = \tfrac{1}{4}(b + 1)$, while the
distribution of zeros gives the complementary design for which $r_0 = r - 1$,
$k_0 = k - 1$ and $\lambda_0 = \lambda - 1$. Therefore when a row of zeros and a column of
ones (in that order) is added to $L_N$, the matrix X'X of the resultant scheme takes
the following form:

(3) $\begin{bmatrix}
N + 1 & r & r & r & \cdots & r \\
r & r & \lambda & \lambda & \cdots & \lambda \\
r & \lambda & r & \lambda & \cdots & \lambda \\
\cdot & \cdot & \cdot & \cdot & \cdots & \cdot \\
r & \lambda & \lambda & \lambda & \cdots & r
\end{bmatrix}$





Making use of the identities well known in the theory of balanced incomplete
block designs and remembering the relationships $2\lambda = r = k = \tfrac{1}{2}(N + 1)$, we have:

(I) The value of the determinant of

$X'X = (r - \lambda)^{N-1}[(N + 1)\{r + \lambda(N - 1)\} - r^2 N] = (r - \lambda)^{N-1}[r + \lambda(N - 1)],$

(II) The value of the determinant suppressing the first row and the first
column $= (r - \lambda)^{N-1}[r + \lambda(N - 1)],$

(III) The value suppressing the second row and the second column

$= (r - \lambda)^{N-2}[(N + 1)\{r + \lambda(N - 2)\} - r^2(N - 1)]$

$= (r - \lambda)^{N-2}[r + \lambda(N - 1)],$

(IV) The value suppressing the first row and the third column

$= (r - \lambda)^{N-2}[r\{r + \lambda(N - 2)\} - r\lambda(N - 1)]$

$= r(r - \lambda)^{N-1},$

(V) The value suppressing the second row and third column

$= (r - \lambda)^{N-2}[\lambda(N + 1) - r^2]$

$= 0.$


Hence, the reciprocal matrix of X'X will be given by

(4) $[X'X]^{-1} = \begin{bmatrix}
1 & -1/k & -1/k & \cdots & -1/k \\
-1/k & 2/k & 0 & \cdots & 0 \\
-1/k & 0 & 2/k & \cdots & 0 \\
\cdot & \cdot & \cdot & \cdots & \cdot \\
-1/k & 0 & 0 & \cdots & 2/k
\end{bmatrix}$

Let Y' denote the column matrix of the results of the weighings, $y_0, y_1, \cdots, y_N$
and B' the column matrix of the estimates of the weights $b_0, b_1, \cdots, b_N$. Then
the estimates will be given by the equation

$B' = [X'X]^{-1}X'Y'.$

It is easy to see that all the rows except the first in $[X'X]^{-1}X'$ are orthogonal to
one another. To explain this, let us take the design given by (2). Here

$$X' = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}.$$

Then $[X'X]^{-1}X'$ will be of the form

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ -1/k & +1/k & +1/k & -1/k \\ -1/k & +1/k & -1/k & +1/k \\ -1/k & -1/k & +1/k & +1/k \end{bmatrix}.$$

In all the rows excepting the first, for every 0 and +1 in X', there will respectively
be a $-1/k$ and a $+1/k$ in $[X'X]^{-1}X'$. It has been mentioned before
that an orthogonal matrix is obtained when −1 is substituted for every 0 in X
or X'. Hence, N rows (all except the first) of $[X'X]^{-1}X'$ will be orthogonal
and these N rows will estimate the N weights in orthogonal linear combinations
of $y_0, y_1, \cdots, y_N$.
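The orthogonality claim for the design (2) can be checked by direct computation. The sketch below is a small illustration in exact arithmetic; the helper names are hypothetical, and it relies on the fact that for a square nonsingular X the matrix $[X'X]^{-1}X'$ is simply the inverse of X.

```python
from fractions import Fraction

X = [[1, 0, 0, 0],
     [1, 1, 1, 0],
     [1, 1, 0, 1],
     [1, 0, 1, 1]]

def inverse(matrix):
    """Gauss-Jordan inverse over exact fractions (assumes a nonsingular matrix)."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# X is square and nonsingular, so [X'X]^{-1} X' reduces to the inverse of X.
estimator = inverse(X)

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

# Substituting -1 for every 0 in X, as described in the text.
hadamard = [[1 if v else -1 for v in row] for row in X]
```

With $k = 2$ the rows of `estimator` after the first have entries $\pm 1/2$ and are mutually orthogonal, and the columns of `hadamard` are mutually orthogonal as well.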

It has been mentioned before that the distribution of zeros in $L_N$ gives the
complementary design, for which $r_0 = r - 1$, $k_0 = k - 1$ and $\lambda_0 = \lambda - 1$.
If to such a design, a row of ones and a column of ones (in that order) be added
to suit the estimation of the weights in a biased spring balance, exactly a similar
situation will be obtained and the estimates will be orthogonal. It can readily
be seen that the design furnished by Yates to weigh seven light objects and a
bias is an illustration of this kind. The scheme given by Yates is the complementary
design of $L_7$ with an additional row and a column of ones added to $L_7$.

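The sixteen combinations of ten objects, a, b, c, d, e, f, g, h, k, l include 1,
which corresponds to weighing with empty pans or, in other words, which is
devoted to estimating the bias. When 1 is omitted, X'X will be of the form

$$\begin{bmatrix}
r & \lambda & \lambda & \cdots & \lambda \\
\lambda & r & \lambda & \cdots & \lambda \\
\lambda & \lambda & r & \cdots & \lambda \\
\cdot & \cdot & \cdot & \cdots & \cdot \\
\lambda & \lambda & \lambda & \cdots & r
\end{bmatrix}$$

where $r = 8$ and $\lambda = 4$. The above matrix X'X is obviously of the same form
as given by $L_{15}$.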

By following exactly the same procedure as given above, it can easily be seen
that when the weighing operation 1 is included in the weighing design, the
solution of the normal equations will lead to independent estimates. The
absence of each letter will be a 0 and the presence a +1 in the design matrix
and if −1 is substituted for every zero, the resultant matrix will be orthogonal.
In some cases, however, the number of letters in all the combinations will not be
the same, i.e. k will not be constant. In such a situation, k in (4) will take the
value of r or of 2λ.


REFERENCES 

[1] O. Kempthorne, "The factorial approach to the weighing problem," Annals of Math.
Stat., Vol. 19 (1948), pp. 238-245.

[2] D. J. Finney, "The fractional replication of factorial arrangements," Annals of
Eugenics, Vol. 12 (1945), pp. 291-301.

[3] F. Yates, Tech. Commun. Bur. Soil Sci. Harpenden, no. 35 (1937), p. 11.

[4] Harold Hotelling, "Some improvements in weighing and other experimental
techniques," Annals of Math. Stat., Vol. 15 (1944), pp. 297-306.

[5] K. Kishen, "On the design of experiments for weighing," Annals of Math. Stat., Vol. 16
(1945), pp. 294-301.

[6] A. M. Mood, "On Hotelling's weighing problem," Annals of Math. Stat., Vol. 17 (1946),
pp. 432-446.

[7] R. L. Plackett and J. P. Burman, "The design of optimum multifactorial experiments,"
Biometrika, Vol. 33 (1946), pp. 305-326.










CONTROL CHART FOR LARGEST AND SMALLEST VALUES 
By John M. Howell 
Los Angeles City College 

1. Introduction. It may at times be desirable to use a control chart for 
largest and smallest values (L & S) in place of the conventional charts for 
averages and ranges (X & R). The chart for largest and smallest values has 
certain advantages: all information may be combined on one chart, computations 
are simple, and specifications may be placed on the chart. In this paper, 
constants for the use of this chart are developed and comparison is made with 
the average and range charts. 

2. Constants for determining limits. Let $L$ and $S$ denote the largest and
smallest values, respectively, in a sample of $n$ pieces, and let $\bar{L}$ and $\bar{S}$ denote the
averages of these values for $k$ samples. Then $(\bar{L} + \bar{S})/2$ and $(\bar{L} - \bar{S})/d_2$ are
unbiased estimates of the population mean and standard deviation, respectively,
in the case of a random sample from a normal population. The value of the
constant $d_2$ is given in [1] and repeated in Table I for convenience. If we denote
$(\bar{L} + \bar{S})/2$ by $M$ and $(\bar{L} - \bar{S})$ by $\bar{R}$, control limits may be determined in terms of
these statistics.

In conformance with usual control chart practice, we will set the upper control
limit at $\bar{L} + 3\hat{\sigma}_L$ and the lower control limit at $\bar{S} - 3\hat{\sigma}_S$, where $\hat{\sigma}_L$ is an estimate
of the standard deviation of the largest values in samples drawn from a normal
population, and similarly for $\hat{\sigma}_S$. The results of Tippett [2] and Pearson [3]
for $E(R)$ of samples from a normal population were used to determine expected
values of $L$ and $S$: $E(R) = d_2\sigma$. Here, $R$ is the range of samples of size $n$:
$R = L - S$. But since $E[(L + S)/2] = a$ for a symmetrical distribution, then
$E(L) = a + d_2\sigma/2$ and $E(S) = a - d_2\sigma/2$, where $a$ and $\sigma$ are the mean and
standard deviation of the normal population from which samples are drawn.

The probability element of the largest value [4] is given by:

$n[F(L)]^{n-1} f(L)\,dL$ where $f(x) = \dfrac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$ and $F(x) = \int_{-\infty}^{x} f(y)\,dy$.

Then $E(L^2) = n \int_{-\infty}^{\infty} L^2 [F(L)]^{n-1} f(L)\,dL$. Integrals of this type, differing only

by a constant factor have been evaluated by Hojo [5] and from his results di was 
determined so that cl = cb = diir. Values for dt for n = 2, 6,10 are also given 
by T'l^Tjett [2]. “Three-sigma” control limits may then be given in the form: 

^ zR, where Ai = 0.5 -f- Sdi/dj. The expected value of the upper control 
up, ( . 11 then be: E(UCL) = a -f- At<r, where At = (di/2) -f- 3dt . Values of 
'ustants for various sample sizes are given in Table I. 
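The constants entering these limits can be approximated by direct numerical integration of the probability element of the largest value. The following is a rough sketch, not the method of the paper (which relies on published tables): it assumes a standard normal population, uses a plain midpoint rule, and the function names and dictionary keys are hypothetical labels. The multiplier of $\bar{R}$ follows from $\bar{L} + 3\hat{\sigma}_L = M + \bar{R}/2 + 3\hat{\sigma}_L$ with $\hat{\sigma}_L$ estimated as a multiple of $\bar{R}/d_2$.

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def largest_value_moments(n, lo=-10.0, hi=10.0, steps=20000):
    """Mean and standard deviation of the largest of n standard normal values,
    by the midpoint rule applied to n * F(x)**(n-1) * f(x)."""
    h = (hi - lo) / steps
    m1 = m2 = 0.0
    for j in range(steps):
        x = lo + (j + 0.5) * h
        w = n * normal_cdf(x) ** (n - 1) * normal_pdf(x) * h
        m1 += x * w
        m2 += x * x * w
    return m1, math.sqrt(m2 - m1 * m1)

def chart_constants(n):
    mean_l, sd_l = largest_value_moments(n)
    d2 = 2.0 * mean_l                  # E(R) / sigma, since E(S) = -E(L)
    return {"d2": d2,
            "sd_largest": sd_l,
            "limit_in_rbar_units": 0.5 + 3.0 * sd_l / d2,
            "expected_ucl_in_sigma": d2 / 2.0 + 3.0 * sd_l}
```

For $n = 3$ this gives $d_2 \approx 1.693$ and a standard deviation of the largest value of about 0.748, in agreement with the corresponding entries of Table I.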
In practice, it might be desired, in the case of control charts for individual
measurements or for $L$ and $S$, to have $E(UCL) = a + 3\sigma$, and the lower control
limit symmetrically placed with respect to the central line. In this case, the
formula for the limits would be: $M \pm 3\bar{R}/d_2$ or $M \pm \sqrt{n}\,A_2\bar{R}$, where $A_2 =
3/(d_2\sqrt{n})$ is given in [1]. Since the efficiency of $M$ decreases rapidly with
increasing sample size [6], it would probably be better to use $\bar{\bar{X}}$ in place of
$M$ for determining the central line for a control chart when the sample size is
greater than five. $\bar{\bar{X}}$ is the "average of averages" as defined in [1].

The chart for largest and smallest values would then consist of a chart on
which both the largest and smallest values are plotted, with the central line at $M$,
and the limits as given above.

3. Comparison of charts for a particular case. A comparison of the L & S 
chart with the X chart for a particular case in which the sample size was three is 
given in Fig. 1. Measurements were the shear strength of spotweld coupons of 


TABLE I 

ConsLanls for largest and smallest value chart 


n 

d. 

di 

A, j 

A, 

A, 

rt 

2 

1.128 

.825 

1.880 

2.72 

3.03 

2 

3 

1.693 

.748 

1,023 ! 

1.82 

3.09 

3 

4 

2.059 

.709 

.729 

1 

1.53 

3.15 

4 

6 

2.320 

.070 

.677 

1.36 

3.17 

6 

6 

2.534 

.648 

.483 

1.27 

3.21 

6 

7 

2.704 

.627 

.419 

1.20 

3.23 

7 

8 

2.847 

.614 

.373 

1.15 

3.20 

8 

9 

2.970 

.600 

.337 

1.10 

3.28 

9 

10 

3.076 

.688 

.308 

1.07 

3.30 

10 


aluminum in pounds. Since the range chart had no points above the “three-sigma”
control limit and showed no other peculiarities, it has been omitted.


4. General comparison of charts. We assume a mean of zero and a standard
deviation of unity as a "given standard," and then compute the probabilities
when the true values are $a$ and $\sigma$. The probability of a point being inside of
3-sigma control limits on the range chart under these conditions is:
$P_1 = \Pr(R < d_2 D_4/\sigma)$, where $D_4$ is given in [1]. The probabilities for the
range used here were found from the Pearson-Hartley tables [3]. The usual
normality assumptions are made.

The probability of a point being inside of “3-sigma" control limits on the 
average chart under the same conditions is: 




((s/v^)—a) 


(pit) di where <p{t) = 


-<>/! 


f /—I rr i IS \ Yvuolo (p\ll = 7^ 
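The probability for the average chart can be evaluated directly from the normal distribution function. The following Python sketch does so; the function name is an assumption made for illustration, and the limits are those set for a standard of mean zero and standard deviation unity.

```python
from statistics import NormalDist

def p_inside_average_chart(n, a, sigma):
    """Probability that a sample average falls inside the 3-sigma limits
    set for mean 0 and standard deviation 1, when the true mean is a and
    the true standard deviation is sigma."""
    phi = NormalDist()                      # standard normal
    scale = (n ** 0.5) / sigma              # averages have s.d. sigma / sqrt(n)
    upper = (3.0 / n ** 0.5 - a) * scale
    lower = (-3.0 / n ** 0.5 - a) * scale
    return phi.cdf(upper) - phi.cdf(lower)

for a, sigma in [(0.0, 1.0), (0.5, 1.0), (0.0, 1.5)]:
    print(a, sigma, round(p_inside_average_chart(3, a, sigma), 3))
```

For n = 3 the values obtained this way agree with the corresponding column of Table II to within rounding.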

Since Daly [7] has shown that the average and range of samples from a normal
population are independent, the probability that a sample is within control
limits on both charts is the product of the probabilities, $P_1P_2$. Thus the
probability that a sample be outside of control limits on either chart is $1 - P_1P_2$.

[Fig. 1. Shear strength of spotweld coupon in pounds. Panels: control chart; chart for largest and smallest values.]

TABLE II

 n    a     σ     P_1    P_2   P_1P_2   P_3    N_1   N_2
 3    0    1.0   .994   .997   .991    .991   510   510
           1.2   .973   .988   .961    .963   116   122
           1.5   .901   .955   .860    .868    31    33
           2.0   .721   .866   .624    .645    10    11
 3   0.5   1.0   .994   .983   .977    .980   198   228
           1.2   .973   .935   .935    .939    69    74
           1.5   .901   .917   .826    .834    25    27
           2.0   .721   .830   .598    .694     9    13
 3   1.0   1.0   .994   .898   .893    .931    41    65
           1.2   .973   .855   .832    .860    25    31
           1.5   .901   .802   .723    .740    15    17
           2.0   .721   .746   .538    .550     8     8
 3   2.0   1.0   .994   .323   .321    .590     5     9
           1.2   .973   .352   .342    .510     5     7
           1.5   .901   .378   .341    .414     5     6
           2.0   .721   .408   .294    .321     4     5
 5    0    1.0   .995   .997   .992    .992   570   570
           1.2   .969   .988   .957    .957   105   105
           1.5   .855   .955   .817    .878    23    36
           2.0   .588   .866   .509    .545     7     8
 5   0.5   1.0   .995   .970   .965    .980   130   227
           1.2   .969   .942   .913    .927    51    62
           1.5   .855   .891   .762    .791    17    20
           2.0   .588   .805   .473    .505     7     7
 5   1.0   1.0   .995   .776   .772    .923    18    58
           1.2   .969   .736   .713    .828    14    25
           1.5   .855   .695   .594    .661     9    12
           2.0   .588   .648   .381    .426     5     6
 5   2.0   1.0   .995   .071   .071    .512     2     7
           1.2   .969   .110   .107    .402     3     6
           1.5   .855   .164   .140    .286     3     4
           2.0   .588   .230   .135    .185     3     3

The probability of the largest and smallest values both lying in the interval
from $-c$ to $c$ is:

$$P_3 = \Pr(-c < S,\ L < c) = \left[\int_{(-c-a)/\sigma}^{(c-a)/\sigma} \varphi(t)\,dt\right]^n.$$

Values of this expression with lower limit $-\infty$ are given in table XXI of [8] for samples of
sizes 3, 5, and 10. For the purpose of comparing the charts, we choose c so that
the probabilities of Type I errors are equal, that is: $1 - P_1P_2 = 1 - P_3$ or $P_1P_2 = P_3$
when the mean is zero and the standard deviation unity. Substituting in this
equation and solving, we find: $F(c) = 0.5 + 0.5\,(.9973\,P_1)^{1/n}$, where
$F(x) = \int_{-\infty}^{x} \varphi(t)\,dt$. For n = 3, c = 2.99 and for n = 5, c = 3.15.

Comparing $P_1P_2$ with $P_3$ when the true values are a and $\sigma$ will then show the
relative power of the $\bar{X}$ & R charts and the L & S chart for detecting lack of
control.

Finally the charts are compared by finding the number ($N_1$ for the $\bar{X}$ & R
charts and $N_2$ for the L & S chart) of samples which will detect lack of control
with a .99 probability under the conditions given above. This is done by
finding the smallest integers which satisfy the following inequalities:
$(P_1P_2)^{N_1} < .01$ and $P_3^{N_2} < .01$. As may be seen from Table II, under most conditions, the
L & S chart is nearly as good as the $\bar{X}$ & R charts for detecting lack of control.
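The quantities used in this comparison can be computed with a few lines of Python. The sketch below is illustrative only: the function names are assumptions, and the value of $P_1$ passed to `matching_c` is taken from Table II rather than recomputed from the range tables.

```python
import math
from statistics import NormalDist

PHI = NormalDist()

def p_inside_ls_chart(n, c, a, sigma):
    """P_3: probability that all n values (hence both L and S) lie in (-c, c)."""
    one = PHI.cdf((c - a) / sigma) - PHI.cdf((-c - a) / sigma)
    return one ** n

def matching_c(n, p1):
    """c giving the same Type I error as the average and range charts:
    F(c) = 0.5 + 0.5 * (0.9973 * p1) ** (1 / n)."""
    return PHI.inv_cdf(0.5 + 0.5 * (0.9973 * p1) ** (1.0 / n))

def samples_to_detect(p_inside, target=0.01):
    """Smallest integer N with p_inside ** N < target."""
    n_samples = math.ceil(math.log(target) / math.log(p_inside))
    while p_inside ** n_samples >= target:   # guard against rounding at the boundary
        n_samples += 1
    return n_samples

print(round(matching_c(3, 0.994), 2), round(matching_c(5, 0.995), 2))
print(round(p_inside_ls_chart(3, 2.99, 0.0, 1.5), 3), samples_to_detect(0.991))
```

With three-figure inputs the values of c come out close to, though not exactly at, the 2.99 and 3.15 quoted above; the small differences are of the size one would expect from rounding in $P_1$.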


REFERENCES

[1] American Standards Association, Control Chart Method of Controlling Quality during
Production, Z1.3-1942.

[2] L. H. C. Tippett, "On the extreme individuals and the range of samples taken from a
normal population," Biometrika, Vol. 18 (1926), pp. 364-387.

[3] E. S. Pearson, "The probability integral of the range in samples of n observations
from a normal population," Biometrika, Vol. 32 (1942), pp. 301-308.

[4] S. S. Wilks, Mathematical Statistics, Princeton University Press, 1943, p. 91.

[5] Hojo, "Distribution of median from a normal population," Biometrika, Vol. 23 (1931),
p. 315.

[6] W. A. Shewhart, Economic Control of Quality of Manufactured Product, D. Van Nostrand
Co., 1931, p. 282.

[7] J. F. Daly, "On the use of the sample range in an analogue of Student's t-test," Annals
of Math. Stat., Vol. 17 (1946), pp. 71-74.

[8] Karl Pearson, Tables for Statisticians and Biometricians, Cambridge University
Press, 1914.


SUFFICIENCY, TRUNCATION AND SELECTION¹

By John W. Tukey

Princeton University

¹ Prepared in connection with work sponsored by the Office of Naval Research.

1. Summary. The fact that the mean and variance were sufficient statistics
for a univariate normal distribution truncated at a fixed point was known to
Fisher by 1931 [2]. Hotelling [3] has recently observed the corresponding fact
for the truncated multivariate normal distribution.

It is the aim of this note to point out that these are special cases of a general
result, namely: If a family of distributions admits a set of sufficient statistics, then
the family obtained by truncation to a fixed set, or by fixed selection, also admits the
SAME set of sufficient statistics.

2. Representation. The basic formal results about sets of sufficient statistics
are due to Fisher [1], whose arguments, with obvious modifications, establish
that families of distributions satisfying the usual conditions have sufficient
statistics. The converse was established by Koopman [4] for a reasonably wide
class of families.

The usual condition can be easily handled and given wide application by
representing the family of distributions in a form suggested to the author by
Rubin, and ascribed by him to Cramér, namely:

$$dF(x \mid \theta) = c(\theta)\,f(x \mid \theta)\,d\mu(x),$$

where x is a possibly multidimensional chance quantity (i.e. random variable),
$\theta$ is a possibly multidimensional parameter, $c(\theta)$ is a positive real function of $\theta$
which serves to normalize the distribution, $f(x \mid \theta)$, the relative probability
density, is a non-negative real function of x and $\theta$, and $\mu(x)$ is a positive measure
function. In this representation the necessary and sufficient condition that
$\{h_i(x)\}$ are a set of sufficient statistics for $\theta$ is the existence of functions $a_i(\theta)$
such that (cf. Koopman [4])

(1) $\dfrac{\partial}{\partial\theta} \log f(x \mid \theta) = \sum_i a_i(\theta)\,h_i(x).$

When $\theta$ is a vector, the derivative is to be interpreted as the gradient (a vector)
and the $a_i(\theta)$ are to be vector-valued functions of $\theta$. We notice that this
condition concerns only the relative density function.

3. Proof of result. Suppose the family $F(x \mid \theta)$ is truncated onto a Borel set
E; this means that

$$\Pr\{x \text{ in } S \mid F(x \mid \theta) \text{ truncated to } E\} = \frac{\Pr\{x \text{ in } S \cap E \mid F(x \mid \theta)\}}{\Pr\{x \text{ in } E \mid F(x \mid \theta)\}}.$$

If $\phi_E(x)$ is the characteristic function of E, which is =1 for x in E and =0
otherwise, and if

$$k(\theta) = \Pr\{x \text{ in } E \mid F(x \mid \theta)\} = \int_E dF(x \mid \theta),$$

then the probability element of $F(x \mid \theta)$ truncated to E is

$$\frac{c(\theta)}{k(\theta)}\,f(x \mid \theta)\,\phi_E(x)\,d\mu(x) = c'(\theta)\,f(x \mid \theta)\,d\nu(x),$$

where $c'(\theta) = c(\theta)/k(\theta)$ and $d\nu(x) = \phi_E(x)\,d\mu(x)$. Truncation has not changed
the relative density function, and the result follows from the form of (1).

Next suppose that, instead of accepting values with probability one in E
and with probability zero outside E, we select according to a fixed Borel function
$\phi(x)$, the chance of accepting a value x being $\phi(x)$. The new family of
distributions has the same sufficient statistics for the same reason.
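The result can be checked numerically in the simplest case, a normal distribution truncated to the half-line $x \ge 0$. The Python sketch below is only an illustration under stated assumptions: the truncation point, the two samples, and the function name are hypothetical choices. The two samples differ but share the same size, sum, and sum of squares, so their likelihoods under the truncated family should agree for every parameter value.

```python
import math
from statistics import NormalDist

def truncated_normal_loglik(sample, mu, sigma, lower=0.0):
    """Log-likelihood of a normal(mu, sigma) sample truncated to x >= lower."""
    dist = NormalDist(mu, sigma)
    # k(theta) = Pr{x in E}: the mass the untruncated normal puts on E.
    k = 1.0 - dist.cdf(lower)
    return sum(math.log(dist.pdf(x) / k) for x in sample)

# Two different samples in E = [0, inf) with the same n, sum, and sum of squares.
sample_a = [0.0, 3.0, 3.0]
sample_b = [1.0, 1.0, 4.0]

for mu, sigma in [(0.0, 1.0), (2.0, 0.5), (-1.0, 3.0)]:
    print(truncated_normal_loglik(sample_a, mu, sigma),
          truncated_normal_loglik(sample_b, mu, sigma))
```

The normalizing factor $k(\theta)$ changes with the parameters, but it is the same for both samples, which is why the agreement survives truncation.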

REFERENCES

[1] R. A. Fisher, "Theory of statistical estimation," Camb. Phil. Soc. Proc., Vol. 22
(1923-25), pp. 700-725.

[2] R. A. Fisher, "The sampling error of estimated deviates together with other
illustrations of the properties and applications of the integrals and derivatives of the
normal error function," Brit. Assn. Adv. Sci. Mathematical Tables, Vol. 1, xxvi-xxxv.

[3] H. Hotelling, "Abstracts of Madison Meeting," Annals of Math. Stat., Vol. 19 (1948).

[4] B. O. Koopman, "On distributions admitting a sufficient statistic," Trans. Amer.
Math. Soc., Vol. 39, pp. 399-409.


ON A PROBABILITY DISTRIBUTION 
By Max A. Woodbury 
University of Michigan 

1. Introduction. The problem treated is that of generalizing the Bernouilli 
distribution to the case where the probability of success is not constant from trial 
to trial but depends on the number of previous successes. The case where the 
probability of an event depends on the number of trials is easily handled and 
is not the case treated here. Several special cases of such a distribution have 
been worked out at one time or another. (E.g., C. C. Craig found the solution for
one such special case and thus called the author's attention to the problem.)

The solution involves the Newton divided difference expansion of powers in a 
form which can be utilized for computation if the number of trials is not too 
large. In the case where the probabilities on a single trial are small, an
approximation (similar to that of the Poisson distribution to the Bernouilli distribution)
can be found.

Applications can obviously be made to urn schema in which black balls are 
replaced, but white balls are removed. Similarly, applications can be made to 
the distribution of the number of plants in a given area. 

2. Solution of the problem. Specifically the problem is as follows: "What is
the probability that in n trials of an event it will occur x times, presuming that
the probability of the event on a given trial depends only on the number of
previous successes?" Denote by $P(n, x)$ the probability of x successes in n
trials and by $p_x$ the probability of the event after x previous successes. As is
conventional, denote $q_x = 1 - p_x$; one can then formulate the following equation
of partial differences:

(1) $P(n+1, x+1) = p_x P(n, x) + q_{x+1} P(n, x+1).$

This equation is an obvious consequence of the statement that x + 1 successes
in n + 1 trials can only occur if there are x successes in n trials and a success on
the (n + 1)st, or x + 1 successes in n trials and failure on the (n + 1)st. The
boundary conditions appropriate are:

(2) $P(n, x) = 0$ for $x < 0$ or $x > n$, and $P(0, 0) = 1$.
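The difference equation and its boundary conditions translate directly into a short computation. The following Python sketch is an illustration, not part of the original derivation; the function name and the particular success probabilities are hypothetical, and exact fractions are used so that the results can be checked by hand.

```python
from fractions import Fraction

def success_distribution(p, n):
    """Return [P(n, 0), ..., P(n, n)] where p[x] is the probability of
    success on a trial that follows x previous successes.

    Applies P(n+1, x+1) = p_x P(n, x) + q_{x+1} P(n, x+1), q_x = 1 - p_x,
    starting from P(0, 0) = 1 (all other boundary values are zero).
    """
    row = [Fraction(1)]                       # P(0, 0) = 1
    for trial in range(n):
        new = [Fraction(0)] * (trial + 2)
        for x, prob in enumerate(row):
            new[x] += (1 - p[x]) * prob       # failure: stay at x successes
            new[x + 1] += p[x] * prob         # success: move to x + 1
        row = new
    return row

# Hypothetical urn-like example: success gets less likely after each success.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4), Fraction(1, 5)]
dist = success_distribution(p, 3)
print(dist, sum(dist))
```

When all the p[x] are equal the routine reduces to the ordinary binomial probabilities, which gives a quick check on the recursion.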

It is convenient and appropriate to generalize (1) while retaining the boundary
conditions (2). The equation (1) will be obtained from the following equation
by setting q = 1:

(3) $P(n+1, x+1) = (q - q_x) P(n, x) + q_{x+1} P(n, x+1).$

It will be noted for further reference at this point that:

(4) $P(n, 0) = q_0^n$

and:

(5) $P(n, n) = (q - q_0)(q - q_1) \cdots (q - q_{n-1}).$

This last suggests a change of variable of the form:

(6) $P(n, x) = F(n, x)(q - q_0)(q - q_1) \cdots (q - q_{x-1}).$

Upon substituting this expression in (3) one obtains a somewhat simpler equation
with the same boundary conditions as (2):

(7) $F(n+1, x+1) = F(n, x) + q_{x+1} F(n, x+1).$

Using the generating function:

(8) $G(x, t) = \sum_{n=x}^{\infty} F(n, x)\,t^n$

one may obtain from (7), using the boundary conditions (2), the following
ordinary linear difference equation:

(9) $G(x+1, t) = t\,[G(x, t) + q_{x+1} G(x+1, t)].$

From (4) it is easily seen that:

(10) $G(0, t) = 1/[1 - q_0 t],$

and hence that the solution of (9) is:

(11) $G(x, t) = t^x/[(1 - q_0 t)(1 - q_1 t) \cdots (1 - q_x t)].$

This may be expanded in partial fractions and the result written:

(12) $G(x, t) = t^x \sum_{i=0}^{x} q_i^x/[(q_i - q_0) \cdots (q_i - q_{i-1})(q_i - q_{i+1}) \cdots (q_i - q_x)(1 - q_i t)].$



By means of the relation in (8) one deduces readily that:

(13) $F(n, x) = \sum_{i=0}^{x} q_i^n/[(q_i - q_0) \cdots (q_i - q_{i-1})(q_i - q_{i+1}) \cdots (q_i - q_x)].$

Jordan [1, p. 19, eq. (1)] shows this to be the xth Newton divided difference of $q^n$
where the expansion is in terms of $(q - q_0) \cdots (q - q_x)$, for $x = 0, 1, \cdots, n$.
The solution for (3) can now be written as:

(14) $P(n, x) = (q - q_0) \cdots (q - q_{x-1})\,F(n, x)$

from which follows:

(15) $\sum_{x=0}^{n} P(n, x) = q^n.$

As remarked before, by setting q = 1 one obtains the solution of (1) subject to
the boundary conditions (2).
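The divided-difference form of the solution can be evaluated directly when the $q_i$ are distinct. The Python sketch below sets q = 1 and uses exact fractions; the function name and the particular values of $q_i$ are assumptions chosen for illustration.

```python
from fractions import Fraction

def divided_difference_solution(q, n, x):
    """P(n, x) from the divided-difference form, with the general q set to 1.

    q[i] is the failure probability after i successes; q[0..x] must be distinct.
    """
    # F(n, x) = sum_i q_i^n / prod_{j != i} (q_i - q_j), i and j in 0..x
    total = Fraction(0)
    for i in range(x + 1):
        denom = Fraction(1)
        for j in range(x + 1):
            if j != i:
                denom *= q[i] - q[j]
        total += q[i] ** n / denom
    # P(n, x) = (1 - q_0) ... (1 - q_{x-1}) F(n, x)
    prefactor = Fraction(1)
    for j in range(x):
        prefactor *= 1 - q[j]
    return prefactor * total

q = [Fraction(1, 2), Fraction(2, 3), Fraction(3, 4), Fraction(4, 5)]
probs = [divided_difference_solution(q, 3, x) for x in range(4)]
print(probs, sum(probs))
```

For n = 3 the divided differences of $q^3$ are easy to write out by hand ($q_0^2 + q_0q_1 + q_1^2$, then $q_0 + q_1 + q_2$, then 1), which makes this small case a convenient check that the probabilities sum to unity.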

It is clear that when all the $q_i$ are equal the Bernouilli distribution should
come out as a special case. Since in this case the divided difference becomes the
corresponding derivative divided by the appropriate factorial, one obtains:

(16) $B(n, x) = \dfrac{(1 - q_0)^x}{x!} \left.\dfrac{d^x q^n}{dq^x}\right|_{q = q_0}.$

Upon reduction this yields the usual formula, but not in the usual way.

By choosing $p_x = \lambda_x/n$ and allowing n to increase without limit one obtains
an analogue of the Poisson distribution, viz:

(17) $P(x) = \lambda_0 \cdots \lambda_{x-1} \sum_{i=0}^{x} e^{-\lambda_i}/[(\lambda_0 - \lambda_i) \cdots (\lambda_{i-1} - \lambda_i)(\lambda_{i+1} - \lambda_i) \cdots (\lambda_x - \lambda_i)]$

which corresponds to the expansion of $e^{-\lambda}$ about $\lambda_0, \lambda_1, \lambda_2, \cdots, \lambda_i, \cdots$ when $\lambda = 0$.
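The limiting form can be evaluated numerically for distinct $\lambda_i$. The Python sketch below is an illustration under stated assumptions: the function name and the linearly increasing rates are hypothetical, and the formula is taken in the form with factors $\lambda_j$ in front and differences $(\lambda_j - \lambda_i)$ in the denominators, which is what the substitution $q_x = 1 - \lambda_x/n$ in the finite solution gives.

```python
import math

def poisson_analogue(lam, x):
    """P(x) = lam_0 ... lam_{x-1} * sum_i exp(-lam_i) / prod_{j != i} (lam_j - lam_i),
    with i and j running over 0..x; lam[0..x] must be distinct."""
    total = 0.0
    for i in range(x + 1):
        denom = 1.0
        for j in range(x + 1):
            if j != i:
                denom *= lam[j] - lam[i]
        total += math.exp(-lam[i]) / denom
    return math.prod(lam[:x]) * total

# Hypothetical rates growing linearly with the number of previous successes.
lam = [1.0 + 0.5 * k for k in range(13)]
probs = [poisson_analogue(lam, x) for x in range(13)]
print([round(v, 4) for v in probs[:4]], round(sum(probs), 4))
```

For x = 0 the expression reduces to $e^{-\lambda_0}$, and the probabilities for the first dozen or so values of x account for nearly all of the total in this example.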


REFERENCE 

[1] Charles Jordan, Calculus of Finite Differences, Chelsea Publishing Co., New York,
2nd ed., 1947.


A GRAPHICAL DETERMINATION OF SAMPLE SIZE FOR WILKS’ 

TOLERANCE LIMITS 

By Z. W. Birnbaum and H. S. Zuckerman

University of Washington

1. Summary. To determine the smallest sample size for which the minimum
and the maximum of a sample are the $100\beta\%$ distribution-free tolerance
limits at the probability level $\epsilon$, one has to solve the equation

(1) $N\beta^{N-1} - (N - 1)\beta^{N} = 1 - \epsilon$

given by S. S. Wilks [1]. A direct numerical solution of (1) by trial requires
rather laborious tabulations. An approximate formula for the solution has
been indicated by H. Scheffé and J. W. Tukey [2]; however, an analytic proof for
this approximation does not seem to be available. The present note describes
a graph which makes it possible to solve (1) with sufficient accuracy for all
practically useful values of $\beta$ and $\epsilon$.

2. Construction of the graph. Substituting in (1)

$$N = \frac{\beta}{1 - \beta}\,x$$

we obtain

$$1 + x = (1 - \epsilon)/\beta^{N}$$

and

(2) $\log(1 + x) = -\log\dfrac{1}{1 - \epsilon} + \left(\dfrac{\beta}{1 - \beta}\log\dfrac{1}{\beta}\right) x.$

To solve (2) graphically, one has to find the intersection of the curve

(3) $y = \log(1 + x)$

with the line

$$y = -\log\frac{1}{1 - \epsilon} + \left(\frac{\beta}{1 - \beta}\log\frac{1}{\beta}\right) x.$$

To prepare a graph on which this can be done, one first plots (3) once for all
(Figure 1, Curve C). Then one marks the points $-\log\frac{1}{1 - \epsilon}$ on the y-axis
and labels them with the values of $\epsilon$ (Figure 1, Scale I); chooses a constant r > 0
and marks the points $r\log\frac{1}{1 - \epsilon}$ on the x-axis (Figure 1, Scale II); chooses a
constant k > 0, marks the points $kr\,\frac{\beta}{1 - \beta}\log\frac{1}{\beta}$ on the x-axis, draws vertical lines
through each of these points, and labels them with the values of $\beta$ (Figure 1,
Scale III); draws the line x = k (Figure 1, line L); marks the uniform Scale IV
on the x-axis.

The graph reproduced here has been prepared with r = 4, k = 6. It can
easily be verified that the instructions on the graph lead to solutions x of (2) and


[Figure 1. Graph for the solution of (2), showing Curve C, Scales I-IV and line L. Instruction 3 on the graph: connect ε on Scale I with Q; the connecting line cuts curve C at a point which has abscissa x on Scale IV; read off x.]


3. Improvement by iterations. The graphical solution, usually accurate to
two significant digits, may be improved easily by iterations. Replacing (2)
by the equation

(4) $x = \dfrac{\log(1 + x) + \log\dfrac{1}{1 - \epsilon}}{\dfrac{\beta}{1 - \beta}\log\dfrac{1}{\beta}}$

one obtains iterations $x_{i+1} = f(x_i)$, where $f(x)$ denotes the right-hand side of (4),
which, for $.80 \le \epsilon \le .999$ and $.80 \le \beta \le .999$, converge rapidly to the solution of (2).

Example. For $\epsilon = .99$, $\beta = .999$, one finds graphically $x_1 = 6.6$, and from
(4) the iteration formula $x_{i+1} = f(x_i)$, which yields the values $x_2 = 6.642$,
$x_3 = 6.648$, $x_4 = 6.649$, $x_5 = 6.649$. Rounding up we obtain the sample
size $N = 6.649 \cdot 999 = 6643$.

For $\epsilon$ and $\beta$ between .80 and .999 all iterations obtained from (4) are on the
same side of the exact solution and converge to it monotonically. Thus, in our
example, from $x_1 < x_2$ we conclude that $x_2$ as well as all further iterations are
smaller than the exact solution.
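The iteration is easily carried out by machine. The Python sketch below is an illustration only: the function name, starting value, and stopping rule are assumptions, and the final sample size is checked against equation (1) directly rather than taken from the rounded value of x alone.

```python
import math

def wilks_sample_size(eps, beta, x0=1.0, tol=1e-10, max_iter=200):
    """Smallest N with N*beta**(N-1) - (N-1)*beta**N <= 1 - eps,
    found by iterating x = f(x) and setting N = x * beta / (1 - beta)."""
    a = math.log(1.0 / (1.0 - eps))
    b = beta / (1.0 - beta) * math.log(1.0 / beta)
    x = x0
    for _ in range(max_iter):
        x_next = (math.log(1.0 + x) + a) / b
        if abs(x_next - x) < tol:
            x = x_next
            break
        x = x_next
    n = math.ceil(x * beta / (1.0 - beta))
    # Confirm against equation (1) directly and adjust if rounding misled us.
    lhs = lambda m: m * beta ** (m - 1) - (m - 1) * beta ** m
    while lhs(n) > 1.0 - eps:
        n += 1
    while n > 2 and lhs(n - 1) <= 1.0 - eps:
        n -= 1
    return x, n

x, n = wilks_sample_size(0.99, 0.999, x0=6.6)
print(round(x, 3), n)
```

Carried out in double precision with these inputs, the iteration settles near x = 6.642 and gives N = 6636, somewhat below the figures quoted in the example. The difference is consistent with a slightly different value of the constant in the denominator of (4), such as limited-precision logarithm tables might produce.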


REFERENCES

[1] S. S. Wilks, Mathematical Statistics, Princeton University Press, 1943, p. 94.

[2] H. Scheffé and J. W. Tukey, "A formula for sample sizes for population tolerance
limits," Annals of Math. Stat., Vol. 15 (1944), p. 217.



ABSTRACTS OF PAPERS 

(Abstracts of papers presented at the New York meeting of the Institute on April 8-9, 1949)

1. Adjustment of an Inverse Matrix Corresponding to a Change in One
Element of a Given Matrix. Jack Sherman and Winifred J. Morrison, The
Texas Company Research Laboratories, Beacon, New York.

If one element, $a_{RS}$, in a square matrix A is changed by an amount $\Delta a_{RS}$, all the elements
$b_{ij}$ in the inverse matrix B are generally changed. A simple equation has been derived by
means of which the elements $b'_{ij}$ in the resulting inverse matrix B' can be computed directly
in terms of $\Delta a_{RS}$ and the elements of B. The equation is

$$b'_{ij} = b_{ij} - \frac{b_{iR}\,b_{Sj}\,\Delta a_{RS}}{1 + b_{SR}\,\Delta a_{RS}}.$$

It follows that any given square matrix can be transformed into a singular matrix by
increasing any one element by the negative reciprocal of the corresponding element in the
transposed inverse matrix.
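The adjustment formula can be checked on a small example. The Python sketch below is an illustration, not the authors' procedure: the 3 by 3 matrix, the changed element, and the helper names are hypothetical, and exact rational arithmetic is used so that the comparison with a freshly computed inverse is exact.

```python
from fractions import Fraction

def invert(matrix):
    """Gauss-Jordan inverse with exact rational arithmetic."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def adjust_inverse(b, r, s, delta):
    """Inverse after a[r][s] changes by delta, given the old inverse b:
    b'[i][j] = b[i][j] - b[i][r] * b[s][j] * delta / (1 + b[s][r] * delta)."""
    denom = 1 + b[s][r] * delta
    n = len(b)
    return [[b[i][j] - b[i][r] * b[s][j] * delta / denom for j in range(n)]
            for i in range(n)]

a = [[4, 1, 2], [1, 3, 0], [2, 0, 5]]          # hypothetical test matrix
b = invert(a)
r, s, delta = 0, 2, Fraction(3)                # change a[0][2] from 2 to 5
a_changed = [row[:] for row in a]
a_changed[r][s] += delta
print(adjust_inverse(b, r, s, delta) == invert(a_changed))
```

The denominator vanishes when delta equals the negative reciprocal of b[s][r], which is the singular case described in the abstract.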

2. The Distribution of the Number of Exceedances. E. J. Gumbel, New York
and H. von Schelling, Naval Research Laboratory, New London, Conn.

The probability for the mth observation in a sample of size n taken from a population
with an unknown distribution of a continuous variate to be exceeded x times in N future
trials is studied. The averages, moments, and the cumulative probability of the number
of exceedances are calculated with the help of the hypergeometric series. The tolerance
limits constructed by Wilks are special cases of the cumulative probability. The mean
number of exceedances is the same as in Bernoulli's distribution. In some cases there are
two modes, namely m - 1 and m - 2. If n = N, the most probable number of exceedances
over the mth largest value is either m or m - 1, and the median number of exceedances is
equal to m - 1. In 50% of all cases, the largest (smallest) of n past observations will not
(always) be exceeded in n future observations. If n and N are both large and equal, the
distribution of the number of exceedances over the median is normal whereas the
distribution of the extremes, similar to Poisson's distribution, has a mean m, and a variance 2m.
The variance of the number of exceedances is largest for the median, and smallest for the
extremes of the previous sample. These distribution-free methods may be applied to
meteorological phenomena, such as floods, droughts, extreme temperatures (the killing
frost), largest precipitations, etc., and permit the forecasting of the number of cases
surpassing a given severity.

3. Note on the Power Function of a Quality Control Chart. Leo A. Aroian, 
Hunter College, New York. 

The power function of a quality control chart is given for a sequence of N sample points
in terms of a and v, the probability of a Type I error and the power function respectively for
a single sample point. Two different models are considered and the generalization to two
quality control charts is indicated.

4. Tests Between Two Means or Regression Coefficients When Observations
are of Unequal Precision. Uttam Chand, University of North Carolina,
Chapel Hill.

Relative merits of different tests available for testing two means or two regression
coefficients in relation to asymmetric and symmetric aspects of Student's hypothesis in case
of unequal population variances have been reconsidered. In this connection the
distribution of a certain quantity [symbol illegible], where k is some inexact value of the unknown
ratio of variances, has been obtained. The hypothesis of the equality of two linear regression
functions in case of unequal residual variances has also been considered.

5. Functional Expansions. Eugene W. Pike, Boston, Massachusetts.

This paper calls attention to a new type of estimation problem, arising both in the
interpretation of experimental data from complex experiments, and in the design of analogue
computers for functions of several independent variables.

It has long been known, though not widely recognized, that the partial sums of rows and
columns arising in the bivariate analysis of variance represent the least squares fit of a
functional form $[f(x) + g(y)]$ to a tabular function of two independent variables x and y, for
example. More recently, several people have realized gradually that independent causes
may combine in much more complicated ways to produce a common effect, and that
correspondingly more complicated functional combinations, such as $[f(x) + g(y) + h(x) \cdot k(y)]$,
can be fitted by least squares to tabular functions of x and y.

Examples of such expansions, as applied both to the design of computers and to the
analysis of experimental data, will be given.

This presentation is based on work supported by the Air Materiel Command, USAF.

6. The Geometric Range for Distributions of Cauchy's Type. E. J. Gumbel,
New York City, and R. D. Keeney, Metropolitan Life Insurance Company,
New York City.

From each of N samples of large size n the largest and the smallest values $X_{n,\nu}$ and $X_{1,\nu}$
($\nu = 1, 2, \cdots, N$) are taken, where each X is measured from the central value of Nn
observations. The sample size must be so large that the probability of any extreme $X_{n,\nu}$ and $-X_{1,\nu}$
being negative may be neglected. The distribution of the geometric means $\rho$ of the N pairs
of extremes, henceforth called geometric ranges, is derived under the assumption that the
initial distribution is symmetric, unlimited and of the Cauchy type, which implies that the
moments of an order equal to, or larger than, k (k > 0) diverge. Let u be the expected
largest value. Then the probability density of [illegible], obtained from a theorem of Elfving
(Biometrika, Vol. 35), is [illegible], where $K_0$ is a Bessel function. This permits calculation of
all moments of [illegible]. Methods are given for estimating the parameters u and k. The
distribution of the geometric ranges $\rho$ is again a Bessel function. A probability paper is
constructed for testing the hypothesis that the initial distribution is of Cauchy's type. A
strict parallelism is established between the asymptotic distributions of the range for the
exponential type, and of the geometric range for Cauchy's type. This provides a criterion
to which of the two types the initial distribution belongs.

7. On Sums of Random Integers Reduced Modulo m. A. Dvoretzky, Institute
for Advanced Study, Princeton and J. Wolfowitz, Columbia University,
New York City.

Let $X_n$ ($n = 1, 2, \cdots$) be an infinite sequence of independent, integral-valued chance
variables, and let m be any fixed integer greater than 1. Put $S_n = \sum_{\nu=1}^{n} X_\nu$ and denote $S_n$
reduced mod. m by $Y_n$; i.e., $Y_n$ is a random variable which assumes only the values
$j = 1, 2, \cdots, m$ with respective probabilities $P_n(j) = \mathrm{Prob}\{S_n \equiv j \pmod{m}\}$. Necessary and
sufficient conditions are obtained for $Y_n$ to be equidistributed in the limit, i.e., for
$P_n(j) \to 1/m$ ($j = 1, 2, \cdots, m$). Some easily applicable sufficient conditions are deduced
and the cases m = 2, 3, 4 are studied in detail. The rapidity with which $P_n(j) \to 1/m$ is also
studied.
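The notion of equidistribution in the limit can be illustrated numerically. The Python sketch below is not taken from the paper: it assumes, for simplicity, that all the $X_n$ have the same distribution, and the function name and the two step distributions are hypothetical. Residues are indexed 0 to m - 1, with 0 standing for the class of m.

```python
def residue_distribution(step_pmf, m, n):
    """Distribution of S_n mod m for S_n a sum of n independent steps.

    step_pmf maps an integer step value to its probability (the same for
    every step here, which is a simplifying assumption). Residues are
    indexed 0..m-1, with 0 standing for the residue class of m.
    """
    dist = [1.0] + [0.0] * (m - 1)            # S_0 = 0
    for _ in range(n):
        new = [0.0] * m
        for residue, prob in enumerate(dist):
            for step, p_step in step_pmf.items():
                new[(residue + step) % m] += prob * p_step
        dist = new
    return dist

fair = {0: 0.5, 1: 0.5}        # steps 0 or 1: equidistributed in the limit
even = {0: 0.5, 2: 0.5}        # only even steps: never reaches odd residues mod 2
print([round(v, 6) for v in residue_distribution(fair, 3, 60)])
print(residue_distribution(even, 2, 60))
```

The second case shows one way equidistribution can fail: if every step is a multiple of a divisor of m, some residues are never reached.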





8. The Corpuscle Problem: Estimating the Surface-Volume Ratio of a
Corpuscle of Arbitrary Shape. Jerome Cornfield, National Institutes of
Health and Harold W. Chalkley, National Cancer Institute, Bethesda,
Md.

Consider a space containing F, a closed figure of arbitrary shape, volume V and surface
area S. Let a line segment of length r be thrown in the space in such a fashion that we have
uniform distribution of the probabilities that the end point P occupies any position in the
space and that the other end point P' occupies any position on the surface of a sphere of
radius r with center at P. Count the number of end points falling in F (0, 1 or 2 for a single
throw), call it the number of hits, and denote it by h. Count the number of times the line
intersects the surface (0, 1 or 2 times for a single throw for a non-reentrant figure, possibly
more for a re-entrant one), call it the number of cuts and denote it by c. Then, it is proved
that $rE(h)/E(c) = 4V/S$. This result is intended to provide a theoretical basis for
estimating the surface-volume ratio of physical objects of any shape.

9. Generalized Hit Probabilities with a Gaussian Target. D. A. S. Fraser,
Princeton University.

In the Supplement to the Journal of the Royal Statistical Society, Vol. 8 (1946), L. B. C.
Cunningham and W. R. B. Hynd proposed a problem and gave an approximate solution
covering a partial range of parameter values: to find the probability that a moving target will
survive a burst of "n" rounds from a rapid-firing gun, account being taken of correlation
between the different points of aim.

Generalizing from the case of a two dimensional target to "k" dimensions, this paper
gives the probability for 0, 1, 2, ..., n hits, under the following assumptions: the "n" points
of aim have a Multivariate Gaussian Distribution, the dispersion error has a Gaussian
Distribution, and the target is a Gaussian Diffuse Target, that is, the probability of a hit on a
particular round as a function of the coordinates of the shell has the form of "a constant
times a Gaussian probability density function."

Limiting distributions are obtained as $n \to \infty$, subject to a variety of limiting conditions.

Numerical values for the probability of at least one hit are plotted when n = 6, for a
range of values, relative to the target size, of dispersion and aiming errors.

10. A New Continuous Sampling Inspection Plan Based on an Analysis of Costs.
F. E. Satterthwaite, General Electric Company, Bridgeport, Connecticut.

Inspection, like all other industrial operations, must be run to produce the most return
for the lowest cost. The costs include overhead and running inspection costs; complaint
costs; rework and scrap costs; and the costs of unnecessary process rejections. Also one
must consider the frequencies of occurrence of these costs. These include the process
average percent defective; the probability of occurrence of a complaint; and the frequency of
occurrence of quality deteriorations.

For continuous inspection, the percentage of the product to be inspected has a very
simple formula: P = SC/HM, where S is the sensitivity of the sampling plan used, C is
the complaint cost, H is the effective inspection cost, and 1/M is the quality deterioration
rate.

It was also necessary to develop a new continuous sampling inspection plan which would
be efficient over the entire range of continuous sampling applications. The plan presented
is a sequential plan which, with suitable attention to details, is easily applied on the shop
floor. The Dodge Plan is a special case and is efficient only in a small percentage of
applications.





11. On the Levels of Significance of the F and Beta Distributions. Leo A.
Aroian, Hunter College, New York.

Two formulas are given for the determinations of the levels of significance of the F and
Beta distributions. In the case of the F distribution a previous set of formulas (Biometrika,
Vol. 34, pp. 359-360) is modified to give 3 significant figure accuracy, $n_1, n_2 \ge 24$. The set
for the Beta distribution is of Cornish-Fisher type, $p, q \ge 0$. The advantages of these over
Paulson's F formula and Carter's z formula are the avoidance of the solution of a quadratic
in the case of Paulson's formula, and the avoidance of the exponential tables in the case of
Carter's z formula. A short numerical table compares the three methods for selected values
of $n_1$ and $n_2$.


12. Certain Statistics for Samples of 3 From a Rectangular Population. Julius
Lieblein, National Bureau of Standards.

A continuation of a study presented at the Madison meeting of the Institute of
Mathematical Statistics last September. (For abstract see Annals of Math. Stat., December 1948,
p. 695.) The previous paper derived properties of the statistics

[formulas defining the three statistics illegible]

where $x_1$, $x_2$, $x_3$ are the observations, ordered by increasing size, in an independent random
sample of three observations from a normal population, and $x'$ and $x''$, $x' \le x''$, are the two
closest of the three. In the present paper distributions (joint as well as simple) are obtained
for the above three statistics and also for $x'''$, the remaining observation not included in
the closest pair, for samples of 3 from a rectangular population, and a theorem is proved
concerning the distribution of $y_1$ for a wide class of continuous populations.


13. The Choice of Lot Inspection Plans on the Basis of Cost. F. E. Satterthwaite
and Burton Grad, General Electric Company, Bridgeport, Connecticut.

An extension of the first paper to single sampling inspection plans. The important
concepts involved are the break-even quality level, the operating ratio, and the weighted prior
odds that a lot is a good lot. Charts are being prepared which can be entered with simple
functions of the costs and which give directly the sample size and acceptance number for
the most efficient single sampling inspection plan.

It appears promising that the method can be extended to double and sequential sampling
plans. This is imperative because of the large portion of the time that "no-inspection" is
the most efficient single sampling plan.



NEWS AND NOTICES 

Readers are invited to submit to the Secretary of the Institute news items of interest.

Personal Items 

Enrique Loizelier Blanco, Professor of Statistics in the University of Madrid,
has just finished the first year of experimentation in Quality Control Methods
in different plants. Interest in these new statistical applications started
in Spain during 1946 and has increased rapidly since then, especially this year
after consecutive bimonthly intensive courses which Professor Blanco has been
teaching.

Mr. Osmer Carpenter, formerly an Instructor in the Department of Statistics 
and Mathematics at Iowa State College is now doing statistical work for Carbide 
and Carbon Chemical Corp., Oak Ridge, Tennessee. 

Dr. K. L. Chung, formerly of Princeton University, has been appointed to an 
assistant professorship at Cornell University. 

Dr. Clyde H. Coombs, Associate Professor of Psychology and Chief of Re¬ 
search Division, Bureau of Psychological Services at the University of Michigan, 
is on leave of absence for the academic year to work at Harvard University on 
problems of scaling. 

Dr. Meyer A. Girshick, formerly with the Douglas Aircraft Co., Santa Monica, 
California, has accepted a professorship in the Department of Statistics, Stanford 
University, Stanford, California. 

Dr. M. J. Gottlieb, who has been with the Institute for Advanced Study at 
Princeton, has been appointed to an assistant professorship at the Newark 
College of Rutgers University. 

Associate Professor E. H. C. Hildebrandt of Northwestern University has
been elected President of the National Council of Teachers of Mathematics.
He is also National Secretary-Treasurer of Pi Mu Epsilon and Secretary of the
Mathematics Section of the Central Association of Science and Mathematics
Teachers.

Dr. C. A. Hollingsworth, formerly with the Acetate Section of the DuPont 
Company, is now an instructor in the Department of Chemistry, University of 
Pittsburgh. 

Professor William G. Madow, who has been with the Institute of Statistics 
at the University of North Carolina, has been appointed Professor of Statistics 
at the University of Illinois. 

Dr. Zenon Szatrowski, formerly teaching in the Economics Department of
Northwestern University, has accepted an associate professorship in the Department
of Economics, University of Oregon, Eugene, Oregon.

Mr. Eric Weyl has resigned his position as staff engineer in the Chicopee Manufacturing
Corporation and is now conducting his own business as a textile engineering
consultant in Manchester, New Hampshire.



New Members 

The following persons have been elected to membership in the Institute (December 1,
1948 to February 28, 1949).

Abruzzi, Adam, M.S. (Columbia Univ.) Student in engineering at Columbia University,
22 W. 107th Street, Shanks Village, New York.

Agarwal, Satya P., M.A. (Agra Univ., India) Student at University of California,
International House, Berkeley 4, California.

Anderson, Robert W., M.A. (Columbia Univ.) Student at Columbia University, 21428-1/2
Road, Queens Village S, New York.

Bahadur, R. R., M.A. (Univ. of Delhi, India) Graduate Student at University of North
Carolina, Chapel Hill, North Carolina.

Blom, Gunnar, Fil.kand. (Stockholm) Olof Skolkonitnys vag S, Aspudden, Sweden.

Burrows, Glenn L., M.A. (Michigan State College) Research Associate, P.O. Box 168,
Institute of Mathematical Statistics, Chapel Hill, North Carolina.

Chapman, Carlos A., Jr., M.S. (Univ. of Michigan) Sales Statistician, Argus, Inc., Ann
Arbor, Michigan, S34 W. Huron St., Ann Arbor, Mich.

Chiang, Chin Long, M.A. (Univ. of Calif.) Student at the University of California, SS5-A
Panoramic Way, Berkeley 4, California.

Coggins, Paul B., M.S. (Univ. of Wisconsin) Graduate Teaching Assistant, University
of Michigan, University Club, Madison B, Wisconsin.

Crapsey, Marcus T., A.B. (Univ. of Michigan) Graduate student at the University of
Michigan, GIB Monroe, Ann Arbor, Michigan.

Coy, John W., M.A. (Univ. of New Mexico) Teaching Fellow, Department of Mathematics,
University of Michigan, £044 Whilewood, Ann Arbor, Michigan.

Cutkosky, Richard E., Student at Carnegie Institute of Technology, Box 40t, Carnegie
Institute of Technology, Pittsburgh, Pennsylvania.

DelPriore, Francis R., B.A. (New York Univ.) Associate Statistician, U. S. Naval Engineering
Experiment Station, Street, N.E., Washington 18, D. C.

Desind, Philip, M.S. (College of City of N. Y.) Statistician, Bureau of Ships, Navy
Department, Washington, D. C., 7418 Georgia Ave., N.W., Washington, D. C.

Dutka, Solomon, M.A. (Columbia Univ.) Chief Statistician, c/o Elmo Roper, 30 Rockefeller
Plaza, New York City, New York.

Dwass, Meyer, B.A. (George Washington Univ.) Graduate student at Columbia University,
Apt. 3A, 809 W. Its St., New York, New York.

Eastman, Walter F., A.B. (Harvard) Central Technical Department, The American
Brass Co., Waterbury, Connecticut.

Eisenpress, Harry, B.A. (College of City of N. Y.) National Bureau of Economic Research,
1819 Broadway, New York 23, New York, 89SS Ocean Parkway, Brooklyn £4,
New York.

Fellows, Clifford Martin, B.S. (Boston Univ.) Assistant Instructor, Boston University,
Bureau of Research and Statistics, 68S Commonwealth Avenue, Boston IB, Massachusetts.

Gowen, John W., Ph.D. (Columbia Univ.) Professor of Genetics, Genetics Department,
Iowa State College, £014 Kildee, Ames, Iowa.

Greenwood, Robert E., Ph.D. (Princeton Univ.) Assistant Professor of Applied Mathematics,
University of Texas, 1704 Windsor Road, Austin, Texas.

Hald, Anders, Ph.D. (Univ. of Copenhagen) Professor of Statistics, University of Copenhagen,
Emdrupvenge 94, Copenhagen 0, Denmark.

Helms, William R., Student at Ohio State University, Stadium Club, Ohio State University,
Columbus 10, Ohio.

Hemphill, F. M., M.S.P.H. (Univ. of Michigan) Major, U. S. Public Health Service, School
of Public Health, University of Michigan, Ann Arbor, Michigan.





Himes, Harold W., B.S. (George Pepperdine College, Los Angeles) Statistician, Test
Design and Analysis Section, U.C.D.W.R., U. S. Navy Electronics Laboratory, San
Diego 52, California.

Hutchinson, L. Charles, Ph.D. (Mass. Institute of Tech.) Associate Professor of Mathematics,
Polytechnic Institute of Brooklyn, Brooklyn, New York.

Klahr, Carl N., M.S. (Carnegie Institute of Tech.) Student, Atomic Energy Commission
Fellow, Carnegie Institute of Technology, 8SS7 Phillips Avenue, Pittsburgh 17, Pennsylvania.

Kraemer, Herbert F., B.S. (Univ. of Delaware) Statistical Engineer, Technical Supervisor,
Commercial Solvents Corporation, Terre Haute, Indiana, UH South 7th St.,
Terre Haute, Indiana.

Kuebler, Roy R., Jr., A.M. (Univ. of Pennsylvania) Associate Professor of Mathematics,
Dickinson College, Carlisle, Pennsylvania.

Lafontant, Herne E., M.S. (Atlantic Univ.) Student at the University of Michigan,
SIB Monroe, Ann Arbor, Michigan.

Lal, Dip Narayan, Ph.D. (Edinburgh Univ.) Lecturer in Mathematics, Patna University,
New Dak Bungalow Road, Patna, Bihar, India.

Liserre, Guido Orlando G., Profesor de Estadistica, Mendoza SBJfi, Rosario, B., Argentina.

Matson, J. H., B.A. (Univ. of Wisconsin) Statistician, Baker Manufacturing Company,
Evansville, Wisconsin.

Monsch, Henry H., B.S. (Missouri School of Mines & Metallurgy, Rolla) Metallurgist,
Aluminum Company of America, Fabricating Division, Alcoa, Tennessee, BS07 Lake
Shore Drive, Knoxville, Tennessee.

Moore, Lucius T., Ph.D. (Johns Hopkins Univ.) Associate Professor, Department of
Mathematics, Brooklyn College, SOB Hicks Street, Brooklyn, New York.

Noack, Albert, Ph.D. (Kiel, Germany) Privatdozent, Studienrat, (S4a) Hamburg-Lokstedt II,
Ttbarg S8, Germany.

Patton, Robert E., A.B. (N. Y. State Teachers College, Albany) Graduate student at the
University of Michigan, SSS Linden St., Ann Arbor, Michigan.

Potter, Muriel, Ph.D. (Columbia Univ.) Instructor in Psychological Foundations, Educational
Research and Reading Supervisor, Teachers College, Columbia University,
414 Riverside Drive, New York SB, New York.

Putz, Robert R., B.A. (Univ. of Minnesota) Teaching Assistant, Department of Mathematics,
University of California, 1631 Cornell Avenue, Berkeley S, California.

Ratoosh, Philburn, M.A. (Columbia Univ.) Assistant in Psychology, Department of
Psychology, Columbia University, New York 27, New York.

Richardson, Wyman, Jr., S.B. (Harvard) Graduate student at the University of North
Carolina, S08-B, Chapel Hill, North Carolina.

Rosenbaum, Sidney, M.A. (Cambridge) Scientific Officer, Ministry of Works, 31, Multon
House, Shore Place, London E.g., England.

Savage, I. Richard, M.S. (Univ. of Michigan) Student at Columbia University, 1414
John Jay Hall, New York 27, New York.

Sheerin, Gail, A.B. (Univ. of Rochester) Statistical Technician, A.E.C. Project, University
of Rochester, 1091 Highland Avenue, Rochester, New York.

Siegert, Arnold J. F., Ph.D. (Leipzig, Germany) Professor of Physics, Department of
Physics, Northwestern University, Evanston, Illinois.

Simpson, Paul B., Ph.D. (Cornell Univ.) Assistant Professor of Economics, Department
of Economics, Stanford University, California.

Solem, Anson D., M.S. (Harvard Univ.) Chief of Fragmentation Section, Naval Ordnance
Laboratory, White Oak, Maryland, ISl Galveston St., S.W., Washington SO, D. C.

Sorensen, Frederick A., B.S. (Carnegie Institute of Tech.) Teaching Assistant in Mathematics,
Carnegie Institute of Technology, 1S04 East End Avenue, Pittsburgh 18, Pennsylvania.





Steel, Robert G. C., M.A. (Acadia Univ., Canada) Instructor and Research Associate,
Statistical Laboratory, Iowa State College, Ames, Iowa.

Taylor, Francis B., A.M. (Columbia Univ.) Instructor in Mathematics, Manhattan
College, New York and Graduate student at Columbia University, 34S E. 193 St., Bronx
68, New York.

Terrell, James R., A.B. (Univ. of Michigan) Statistical Clerk, Research Center for
Group Dynamics, P.O. Box 351, Ann Arbor, Michigan.

Tick, Leo J., B.S. (Iowa State College) Research Graduate Assistant, Statistical Laboratory,
Iowa State College, Ames, Iowa.

Tyler, Sylvanus A., S.M. (Univ. of Chicago) Associate Mathematician (Biometrics),
Argonne National Laboratory, P.O. Box 5207, 9059 So. Slmarl Avenue, Chicago 30,
Illinois.

Tysver, Joseph B., M.A. (Washington State College) Teaching Fellow, University of
Michigan, UOJ). Ening Court, Willow Run Village, Michigan.

Umarji, Raghavendra R., A.M. (Columbia Univ.) Lecturer in Mathematics, Bombay
Educational Service, 609 John Jay Hall, Columbia University, New York 27, New York.

Wilburn, A. J., A.B. (Howard Univ.) Statistician, Civil Aeronautics Board, Washington,
D. C., 35-46th Place, N.E., Washington, D. C.


Correction 

The information following Paul Koditschek's name which appeared in the March issue
of the Annals, page 149, should have appeared as follows:

Koditschek, Paul, LL.D. (Univ. of Vienna) Research Associate, Scientific Research Service,
319 W. ISlh Street, New York li, New York.

(It was implied in the original notice that Scientific Research Service is connected with 
Columbia University.) 


News Item from Cornell 

With the continued support of a research contract with the Office of Naval Research, the
Mathematics Department of Cornell University is further expanding research and instruction
in the theory of probability and its applications. At present Professors Feller, Kac,
Chung and Dr. Donsker are participating in the work. Professor G. Elfving of the University
of Helsingfors has been appointed Visiting Professor of Mathematical Statistics
for the academic years 1949-1951. Professor J. L. Doob, on sabbatical leave from the University
of Illinois, will spend the year 1949-50 at Cornell. Dr. Gilbert Hunt has been appointed
Assistant Professor of Mathematics.



REPORT ON THE NEW YORK MEETING OF THE INSTITUTE 


The thirty-eighth meeting of the Institute of Mathematical Statistics was
held at Columbia University, New York City, on Friday afternoon and Saturday,
April 8-9, 1949. The meeting was attended by 93 persons including the following
80 members of the Institute:

A. Abruzzi, T. W. Anderson, Leo A. Aroian, Robert Bechhofer, A. A. Bennett, Joseph
Berkson, Allan Birnbaum, C. I. Bliss, Paul Boschan, P. G. Carlson, Uttam Chand, Yunien
Chen, E. P. Coleman, T. E. Cope, Jerome Cornfield, L. M. Court, M. I. Cropsen, J. H. Curtiss,
Cuthbert Daniel, F. R. Del Priore, W. E. Deming, J. A. Dudman, David Durand,
C. W. Dunnett, A. Dvoretzky, P. S. Dwyer, Churchill Eisenhart, H. L. Edgett, Harry
Eisenpress, Lillian R. Elveback, D. A. S. Fraser, Murray Geisler, L. A. Goodman, J. I.
Griffin, C. C. Grove, E. J. Gumbel, Miriam S. Harold, Mina Haskind, L. H. Herbach, Harold
Hotelling, Cuthbert Hurd, Arthur Kaufman, Roger D. Keeney, Paul Koditschek, Carl F.
Kossack, Howard Levene, Jack Laderman, I. D. Lorge, C. L. Marks, Paul Meier, Frederick
Mosteller, E. B. Mundie, C. M. Mottley, I. U. Mulk, Paul Neurath, G. E. Noether, Doris
Newman, M. L. Norden, E. W. Pike, J. K. Perrin, H. M. Rosenblatt, Frank Saidel, William
Salkind, F. E. Satterthwaite, Richard Savage, Henry Scheffé, H. L. Seal, Jack Sherman,
Rosedith Sitgreaves, J. H. Smith, J. J. Sodano, Herbert Solomon, Mary N. Torrey, J. W.
Tukey, S. S. Wilks, D. F. Votaw, Helen M. Walker, Lionel Weiss, Jack Wolfowitz and
W. W. Wryht.

The Friday afternoon session consisted of a Symposium on Applications of
Multivariate Analysis, Professor S. S. Wilks of Princeton University presiding.
The following two invited papers were given:

1. Tests of Differences in Composite Growth Measurements in Pig Feeding Trials, J. Wishart,
Cambridge University and University of North Carolina.

2. Fields of Application of Multivariate Analysis, Harold Hotelling, University of North
Carolina.

The prepared discussion was presented by Professor S. N. Roy, Presidency
College, Calcutta, and Columbia University, followed by discussion from the
floor.

The Saturday morning session was opened by a business meeting, Dr.
Churchill Eisenhart, National Bureau of Standards, presiding. Among other
items of business the Constitution of the Institute was amended to provide for
Institutional Membership, and the By-Laws amended to specify the status and
privileges of Institutional Members. The revised Constitution and By-Laws
appear elsewhere in this issue.

The second part of the session, Dr. W. Edwards Deming presiding, was
devoted to an invited address: Non-linear Regression Laws and “Internal Least
Squares,” by Dr. H. O. Hartley, University College, London and Princeton
University.

At the Saturday afternoon session, Professor Henry Scheffé, Columbia University,
presiding, the following contributed papers were presented, ten in
person, three by title:

1. Adjustment of an Inverse Matrix Corresponding to a Change in One Element of a Given
Matrix.

Jack Sherman and Winifred J. Morrison, The Texas Company Research Laboratories,
Beacon, N. Y.

2. The Distribution of the Number of Exceedances.

E. J. Gumbel, New York, N. Y., and H. von Schelling, Naval Research Laboratory,
New London, Connecticut.

3. Note on the Power Function of a Quality Control Chart.

Leo A. Aroian, Hunter College.

4. Tests between Two Means or Regression Coefficients When Observations Are of Unequal
Precision.

Uttam Chand, University of North Carolina.

5. Functional Expansions.

Eugene W. Pike, Boston, Massachusetts.

6. The Geometric Range for Distributions of Cauchy’s Type.

E. J. Gumbel, New York, N. Y., and R. D. Keeney, Metropolitan Life Insurance Company,
New York, N. Y.

7. On Sums of Random Integers Reduced Modulo m.

A. Dvoretzky, Hebrew University, Jerusalem, and Institute for Advanced Study, and
J. Wolfowitz, Columbia University.

8. The Corpuscle Problem: Estimating the Surface-Volume Ratio of a Corpuscle of Arbitrary
Shape.

Jerome Cornfield, National Institute of Health, and Harold W. Chalkley, National
Cancer Institute, Bethesda, Maryland.

9. Generalized Hit Probabilities with a Gaussian Target.

D. A. S. Fraser, Princeton University.

10. A New Continuous Sampling Inspection Plan Based on an Analysis of Costs.

F. E. Satterthwaite, General Electric Company, Bridgeport, Connecticut.

11. On Levels of Significance of the F and Beta Distributions. (By title)

Leo A. Aroian, Hunter College.

12. Certain Statistics for Samples of 3 from a Rectangular Distribution. (By title)

Julius Lieblein, Statistical Engineering Laboratory, National Bureau of Standards.

13. The Choice of Lot Inspection Plans on the Basis of Cost. (By title)

F. E. Satterthwaite and Burton Grad, General Electric Company, Bridgeport, Connecticut.

On Friday evening a dinner was held at the Men’s Faculty Club.

S. B. Littauer
Assistant Secretary



CONSTITUTION OF THE INSTITUTE OF 
MATHEMATICAL STATISTICS 

ARTICLE 1 
Purpose

The Institute of Mathematical Statistics is a society for encouraging the
development, dissemination, and application of mathematical statistics.

ARTICLE 2 

Members 

The Institute shall have Members and Institutional Members. Applications
for membership must be approved by the Council. The Council may delegate 
this authority. 

Except for nonpayment of dues, no Member or Institutional Member shall be 
expelled or suspended except by three-fourths vote of the Council. 

ARTICLE 3 
Officers 

The Officers of the Institute shall be the President, the President-Elect, the 
Secretary, the Treasurer, and the Editor. The terms of office of the Secretary, 
the Treasurer and the Editor shall be three years. The terms of office of the 
President and the President-Elect shall be one year. The President-Elect shall 
succeed the President in that office. If the President is incapacitated, the 
President-Elect shall act as President, or, in case the President-Elect is also 
incapacitated the Secretary shall so act. Incapacity shall be determined by the 
Council. 

The President shall act as chairman of the Council and of the Executive Committee,
and shall appoint the Committees and representatives of the Institute,
with the exception of the Committee on Fellows and the Executive Committee.
Such Committee appointments shall be for terms of not more than three years, 
provided that committee appointments extending beyond the current year shall 
be either to standing committees with regularly rotating membership, or to 
temporary committees assigned specific tasks. 

The Treasurer shall present financial statements to the Council and shall bring 
condensed statements to the attention of the Members. 

The Secretary shall record the actions of the Council and of the Executive 
Committee and of Institute meetings, arrange for and inform the Members of 
meetings and conduct the correspondence of the Institute except as otherwise 
assigned by the Executive Committee. The Secretary may appoint Assistant 
Secretaries to assist him in connection with specified meetings or for other 
occasions. The offices of Secretary and Treasurer may be combined. 



ARTICLE 4 
Council 

The Council shall consist of not less than twelve elected members in addition
to the Officers of the Institute except that vacancies in the Council occurring
subsequent to an election shall not be filled until the next annual election.

Elected members shall be elected for terms of three years, the terms of approximately
one-third of them terminating each year.

The Council, representing the Members, shall determine the policies and 
supervise the affairs of the Institute in accordance with any By-Laws the Institute
may adopt. It shall determine the standing committees of the Institute and 
the number of elected members of the Council. 

The Council shall elect the Secretary, the Treasurer, and the Editor, by 
majority vote. The Council shall determine the number, if any, of Associate
Secretaries, Associate Treasurers and Associate Editors. The Secretary shall
nominate Associate Secretaries, the Treasurer shall nominate Associate
Treasurers, and the Editor shall nominate Associate Editors which the Council 
may elect by majority vote. Such Associate Secretaries, Treasurers, and 
Editors shall be non-voting members of the Council. 

The Council shall meet at least twice a year, usually at times of meetings of 
the Institute, and otherwise at the call of the President or the call of any five 
members of the Council. Any voting member unable to be present may appoint, 
in writing, a representative to speak for him, and such representative shall be 
entitled to vote. A quorum shall be seven persons entitled to vote. Majorities 
and other fractions of the Council are to be based on the number of persons 
present and entitled to vote.

ARTICLE 5 
Executive Committee 

The Officers shall constitute the Executive Committee of the Council, and 
shall conduct the affairs of the Institute. 

The Executive Committee may create temporary committees with assigned 
tasks coming within the scope of the Institute. 

ARTICLE 6 

Nominations 

The President shall appoint a Nominating Committee and shall announce 
their names at the annual meeting when he retires from office. This Committee 
shall submit to the Members, through the Secretary and at least sixty days 
before the closing of polls at the next succeeding annual meeting, one nomination 
for President-Elect and a slate containing at least twice as many names as there 
are vacancies on the Council. 

Additional nominations may be made for President-Elect or for the Council 
by a petition signed by twenty Members. Such nominations shall appear on
the ballot if they are in the hands of the Secretary at least 30 days before the
closing of polls at the next succeeding annual meeting. In any event, Members 
may vote for names in addition to those nominated. 

ARTICLE 7 

Fellows 

The Council may, by majority vote, elect to fellowship any Member nominated
by the Committee on Fellows. Such nomination and election shall be on
the basis of the nominee’s contributions to the development, dissemination, and
application of mathematical statistics.

ARTICLE 8 
Committee on Fellows 

The Council shall elect two Fellows annually to serve for three years on the 
Committee on Fellows. One of the Members whose term is next to expire shall 
be designated by the President as chairman. 

ARTICLE 9 

Publications 

The Annals of Mathematical Statistics shall be the official journal of the Institute.
Other publications may be authorized by the Council.

The publications of the Institute shall be supervised by the Editor, with the
assistance of the Associate Editors and such committees as the Council may
approve.

ARTICLE 10 
Communications 

Public announcements concerning the Institute, including statements of policy, 
recommendations, reports of committees and accounts of Council meetings shall 
be issued by the Secretary or the President with the prior approval of the Council 
or its Executive Committee. Advance publicity concerning meetings may be 
released by authorized Program Committees or Publicity Committees.

ARTICLE 11 

Affiliation 

By a three-fourths vote, the Council may authorize the affiliation of the 
Institute with any organization whose aims are consistent with those of the 
Institute. 

ARTICLE 12 

Amendments 

This constitution may be amended by an affirmative two-thirds vote of those 
Members voting at any regularly convened meeting of the Institute provided
notice of such proposed amendment shall have been sent to each Member by
the Secretary at least thirty days before the date of the meeting at which the 
proposal is to be acted upon. Members may vote in person or by mail. The 
Secretary shall send to the Members any amendments recommended by the 
Executive Committee or proposed through a petition of 25 Members of the
Institute.

ARTICLE 13 
Emergencies 

In an emergency, as determined by the President or the Executive Committee,
or by a majority of the Council, a meeting of the Council to transact business 
or a meeting of the Institute to amend the constitution may be conducted by 
mail. 

BY-LAWS OF THE INSTITUTE OF MATHEMATICAL STATISTICS 

ARTICLE 1 
Duties of Officers 

The President, or in his absence the President-Elect, or in his absence a Member
appointed by the Executive Committee, shall preside at business meetings
of the Institute.

The Treasurer shall send out calls for annual dues, pay all bills for expenditures 
authorized by the Institute, Council, or Executive Committee; keep a detailed 
account of all receipts and expenditures; prepare a financial statement at the 
end of each fiscal year and present an abstract of same at a business meeting of 
the Institute after it has been audited by a Member or Members appointed by 
the President, to whom such Member or Members shall report. 

The Secretary shall, subject to the direction of the Council, have charge of 
the archives and other tangible and intangible property of the Institute and shall, 
upon the direction of the Council, publish a classified list of all Members of the
Institute, and of Institutional Members at their request. 

The Editor, subject to the direction of the Council, shall have charge of all 
editorial matters, whether relating to the Official Journal or to other publications.
He shall, with the advice and consent of the Council, appoint an Editorial Committee
of not less than twelve Members to cooperate with him for definite terms.
All appointments to the Editorial Committee shall terminate with the appointment
of a new Editor.


ARTICLE 2 
Dues 

Members shall pay seven dollars at the time of admission to membership and 
shall receive the full current volume of the Official Journal. Thereafter Members
shall pay seven dollars annual dues, of which five dollars shall be for a subscription
to the Official Journal. There shall be the following exceptions:

A. Two Members of the Institute who are husband and wife may elect to
receive one copy of the Official Journal between them, when their dues
shall each be reduced by twenty-five percent.

B. Any Member may make a payment in place of all succeeding annual dues
based on a suitable table and rate of interest specified by the Council.

C. Any Member on active military duty may notify the Treasurer that he
wishes neither to pay dues nor to receive the Official Journal during the
current year. He may receive the Official Journal for the suspended
years on payment of one-half of the suspended dues within one year after
resuming payment of annual dues.

D. Any Member who resides outside the Western Hemisphere shall pay five
dollars annual dues.

Institutional Members shall pay annual dues of at least $100. For each $100
of annual dues, an Institutional Member shall receive two copies of the Official
Journal, one bound, and shall be entitled to designate one person to have the 
full prerogatives of a member without further payment of dues (including the 
receipt of a personal copy of the Official Journal). Twenty-five dollars of each 
$100 shall be allocated to the three subscriptions to the Official Journal and the 
binding of one copy. 

Annual dues shall be payable on the first day of January of each year. 

It shall be the duty of the Treasurer to notify by mail anyone whose dues are
six months in arrears, enclosing a copy of this article. If such person fails to
pay such dues within three months from the date of mailing such notice, the
Treasurer shall report the delinquent to the Council, who may suspend the
delinquent from membership and who may reinstate the delinquent upon payment
of arrears.

ARTICLE 3
Salaries

The Institute shall not pay a salary to any Officer, Councilor, or member of 
any committee. 


ARTICLE 4 
Amendments 

These Bylaws may be amended in the same manner as the Constitution or, 
if the proposed amendment has been previously approved by the Council, by a 
majority vote at any regularly convened meeting. 



JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 

DECEMBER, 1948 
Articles

Commercial Uses of Sampling . . . J. Stevens Stock and Joseph K. Hochstim

Variation of the Frequency of Fatal Quarrels with Magnitude . . . Lewis F. Richardson

Bank Reserves and Business Fluctuations . . . Clark Warburton

The Ordering of n Items Assigned to k Rank Categories by Votes of m Individuals
. . . Garret L. Schuyler

Levels of Significance for Variance Ratio of Two Samples of Equal Size . . . C. J. Kirchen

Main Effects and Interactions . . . D. J. Finney

A Test for Symmetry in Contingency Tables . . . Albert H. Bowker

The War Production Board’s Statistical Reporting Experience, Part IV
. . . David Novick and George A. Steiner

Correction to “On Estimating Precision of Measuring Instruments and Product
Variability” . . . Frank E. Grubbs

Statistical Methodology Index . . . Oscar Krisen Buros

AMERICAN STATISTICAL ASSOCIATION 
1603 K Street, N. W., Washington 6, D. C. 


MATHEMATICAL REVIEWS 


A journal containing reviews of the mathematical literature
of the world, with full subject and author indices

Publication of this journal is sponsored by the American Mathematical
Society, Mathematical Association of America, Institute of
Mathematical Statistics, London Mathematical Society, Edinburgh
Mathematical Society, Union Matematica Argentina, and others.

Subscriptions accepted to cover the calendar year only.

Issues appear monthly except July. $20.00 per year.


Send subscription order or request for sample copy to

AMERICAN MATHEMATICAL SOCIETY
531 West 116th Street, New York City 27





ON THE THEORY OF SYSTEMATIC SAMPLING, II 
By William G. Madow¹

Institute of Statistics, University of North Carolina 

1. Summary and introduction. In an earlier paper² [1] an approach to the
problem of systematic sampling was formulated, and the associated variance
obtained. Several forms of the population were assumed. The efficiency of the
systematic design as compared with the random and stratified random design
was evaluated for these forms. It was remarked that as the size of sample increased
the variance of a systematic design might also increase, contrary to the
behavior of variances in the random sampling design. This possibility was verified
in [2].

One approach to the study of systematic designs, given by Cochran [3], removed
this difficulty to some extent by changing the problem to one of the expected
variance, and supposing the elements of the population to be random variables.
He showed that if the correlogram of these random variables is concave upwards,
then the expected variance of the systematic design would be less, and often
considerably less, than the variance of a stratified design.

In the present paper the results of the earlier papers are extended to the systematic
sampling of clusters of equal and unequal sizes. Some comments on
systematic sampling in two dimensions are included.

In section 2 we derive two theorems that have considerable applications in 
many parts of sampling. Although it has been common for people working in 
sampling theory to tell each other that these theorems ought to be true, yet no 
reference seems to exist. 

In section 3 we develop the implications of a remark [1, p. 13] that in designing
sample surveys we should try to induce negative correlation between strata. In
Theorem 3 we obtain sufficient conditions for the correlation to be negative.
The lemma and Theorem 4 given in Section 4 enable us to extend the uses of
Theorem 3 in practice. As an application of these results, we show that if a
population has a concave upwards correlogram, and if strata are defined in an
optimum fashion for the selection of one element at random from each stratum,
then we can define a systematic type design that will be more efficient than
independent random selection from each stratum.

In sections 5 and 6 we obtain various results in the systematic sampling of
clusters largely as applications of the more general theorems of the earlier sections.
In general the results are of a nature similar to those of [1] and [3] in that
the formulae show the conditions under which systematic sampling may be
expected to be more efficient than random or stratified random sampling. We
have not, however, applied these formulae to specified types of populations.
From [1, 2 and 3] it is already apparent that this work will be useful and such
studies should be more valuable when made in connection with important types
of surveys or data than when made as illustrations in a general paper.

¹ Submitted for publication, November, 1948. Parts of this paper were prepared while
the author was Visiting Professor of Statistics at the University of Sao Paulo, Brazil.

² References to the articles and book cited are given by Roman numerals.

2. Random events and conditional expectations. Almost invariably, samples
are selected in several stages. For example, to select a sample of households from
a city one frequently used method is the following two-stage sampling plan:

a. A map of the city showing the location of each block is obtained and 
brought up-to-date. 

b. Using this map, a sample of the blocks of the city is selected (this is stage 1).

c. From the households on the blocks selected in stage 1, a subsample of households
is selected (this is stage 2).

In this section, we give a general approach for evaluating the means and
variances associated with multi-stage sampling. This approach has the advantage
of at once yielding the contributions to the variance arising from
each stage. Furthermore, the theorems presented are useful in calculating variances
even when our interest is not in multi-stage sampling. The theorems are
presented in general terms because of their wide application in sampling.
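
As an illustration only, the two-stage plan of steps a to c might be sketched in Python as follows. The function name `two_stage_sample`, the made-up city, and the choice of simple random sampling without replacement with a fixed number of households per block are assumptions for the example; the text does not prescribe how the selection at either stage is made.

```python
import random

def two_stage_sample(blocks, n_blocks, n_households, seed=None):
    """Hypothetical helper: pick n_blocks blocks (stage 1), then up to
    n_households households within each selected block (stage 2)."""
    rng = random.Random(seed)
    # Stage 1: simple random sample of block labels, without replacement.
    chosen_blocks = rng.sample(sorted(blocks), n_blocks)
    sample = {}
    for b in chosen_blocks:
        households = blocks[b]
        # Stage 2: subsample households on the selected block.
        k = min(n_households, len(households))
        sample[b] = rng.sample(households, k)
    return sample

# Made-up "map" of a city: 6 blocks, each listing its households.
city = {f"block{i}": [f"b{i}-h{j}" for j in range(1, 5 + i)] for i in range(1, 7)}
result = two_stage_sample(city, n_blocks=3, n_households=2, seed=0)
```

Here `result` maps each selected block to its subsampled households, so the randomness of each stage is kept separate, which is the structure the section goes on to exploit.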

We shall say that the result of performing an operation is a random event $A^*$
if the result can assume $m$ possible states $A_1, \ldots, A_m$ with probabilities
$p_1, \ldots, p_m$, where

$P\{A^* = A_i\} = p_i, \qquad \sum_{i=1}^{m} p_i = 1,$

and $P\{A^* = A_i\}$ is read "the probability that the random event $A^*$ assumes
the state $A_i$."

One illustration of an operation is the operation of selecting a sample of blocks.
If there are $N$ blocks in the city of which we select $n$ in such a way that each
set of $n$ of the $N$ blocks is a possible sample, then there are $C^N_n$ possible samples.
In this case $m = C^N_n$ and the $C^N_n$ possible samples are the $m$ states of $A^*$, "the
result of selecting the sample of blocks." Furthermore, if each of the possible
samples of blocks is equally likely to be selected, then $p_i = 1/C^N_n$.

The random event $A^*$ may also be the taking on by a random variable of
one of its possible values. If $z^*$ is a random variable having possible values
$z_1, \ldots, z_m$ with probabilities $p_1, \ldots, p_m$ then we can define the states of
$A^*$ to be $A_i$ where $A_i$ is "$z^* = z_i$."

Thus the notion of a random event includes the two types of randomness 
that are met in selecting samples. 
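The block-selection illustration can be made concrete by listing the states. The following is a minimal sketch, not part of the original paper: the function name `equally_likely_samples` and the choice of 5 blocks and samples of 2 are assumptions made only for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def equally_likely_samples(N, n):
    """List every sample of n blocks out of N and the common probability of each."""
    states = list(combinations(range(1, N + 1), n))  # the m states A_1, ..., A_m
    p = Fraction(1, comb(N, n))                      # each state has probability 1 / C(N, n)
    return states, p

states, p = equally_likely_samples(N=5, n=2)
print(len(states), p)  # number of states m and the probability of each state
```

Here each tuple in `states` plays the role of one state of the random event, and the probabilities sum to one by construction.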

Let $x'$ be a random variable. Then, by the conditional expectation of $x'$ subject
to the random event $A^*$ is meant the random variable $E^*(x' \mid A)$ whose possible
values are $E(x' \mid A_i)$, $i = 1, \ldots, m$, and whose probabilities are $p_i$, that is

$P\{E^*(x' \mid A) = E(x' \mid A_i)\} = p_i = P\{A^* = A_i\},$



SYSTEMATIC SAMPLING 




where

(2.1) $E(x' \mid A_i) = \sum_{j=1}^{N_i} x_{ij}\, p_j(A_i),$

$x_{ij}$ is the $j$th of the $N_i$ possible values of $x'$ when $A_i$ occurs, and

$p_j(A_i) = P\{x' = x_{ij} \mid A_i\}$

is "the probability that $x' = x_{ij}$ given that $A_i$ occurs." It should be noted
that if

$p_{ij} = P\{x' = x_{ij}\},$

then

$p_{ij} = P\{x' = x_{ij},\ A^* = A_i\}$

since the fact that $x' = x_{ij}$ implies the occurrence of $A_i$. Then

(2.2) $p_i\, p_j(A_i) = p_{ij}.$

We state Theorems 1 and 2 without proof since their proofs are immediate. 
Theorem 1. The expected value of the random variable $E^*(x' \mid A)$ is $E x'$, i.e.

$E\{E^*(x' \mid A)\} = E x'.$

By $\sigma_{x'y'|A}$ we shall mean the random variable whose possible values are
$\sigma_{x'y'|A_i}$, $i = 1, \ldots, m$, where

$\sigma_{x'y'|A_i} = E\{[x' - E(x' \mid A_i)][y' - E(y' \mid A_i)] \mid A_i\}$

and

$P\{\sigma_{x'y'|A} = \sigma_{x'y'|A_i}\} = p_i = P\{A^* = A_i\},$

i.e.

$\sigma_{x'y'|A} = E^*\{[x' - E^*(x' \mid A)][y' - E^*(y' \mid A)] \mid A\}.$

Furthermore, the symbol $\sigma_{E^*(x'|A)\,E^*(y'|A)}$ will stand for "the covariance of the
two random variables $E^*(x' \mid A)$ and $E^*(y' \mid A)$." The corresponding definitions
of variance are obtained by replacing $y'$ by $x'$ above.

Theorem 2. If $x'$ and $y'$ are random variables, then

$\sigma_{x'y'} = E\sigma_{x'y'|A} + \sigma_{E^*(x'|A)\,E^*(y'|A)}$

and

$\sigma^2_{x'} = E\sigma^2_{x'|A} + \sigma^2_{E^*(x'|A)}.$

We note that, since the $x_{ij}$, $p_i$ and $p_j(A_i)$ are not specified, Theorems 1 and
2 are valid for any two-stage plan. The generalizations of Theorems 1 and 2 
to multi-stage plans are obvious, but in practice it often turns out to be simpler 
to apply the theorems several times. 
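Theorems 1 and 2 can be checked by direct enumeration on a small case. The sketch below is illustrative only: the three states, their probabilities, and the values of $(x', y')$ are made-up numbers, and the helper names (`cond_mean`, `cond_cov`, `mean`, `cov`) are assumptions introduced here. Exact rational arithmetic is used so that the two sides agree exactly.

```python
from fractions import Fraction as F

# Hypothetical two-stage plan. State A_i occurs with probability p[i]; given A_i,
# the pair (x', y') takes each listed value with the listed conditional probability.
p = [F(1, 2), F(1, 3), F(1, 6)]
cond = [
    [((1, 4), F(1, 2)), ((3, 0), F(1, 2))],
    [((2, 2), F(1, 4)), ((5, 1), F(3, 4))],
    [((0, 7), F(1))],
]

def cond_mean(i, k):
    """E(x' | A_i) for k = 0, E(y' | A_i) for k = 1."""
    return sum(v[k] * q for v, q in cond[i])

def cond_cov(i):
    """Conditional covariance of x' and y' given A_i."""
    mx, my = cond_mean(i, 0), cond_mean(i, 1)
    return sum((v[0] - mx) * (v[1] - my) * q for v, q in cond[i])

def mean(k):
    """Unconditional expectation computed directly from the joint distribution."""
    return sum(p[i] * q * v[k] for i in range(len(p)) for v, q in cond[i])

def cov():
    """Unconditional covariance computed directly from the joint distribution."""
    mx, my = mean(0), mean(1)
    return sum(p[i] * q * (v[0] - mx) * (v[1] - my)
               for i in range(len(p)) for v, q in cond[i])

# Theorem 1: the expected value of the conditional expectation is the overall mean.
mean_of_cond = sum(p[i] * cond_mean(i, 0) for i in range(len(p)))

# Theorem 2: expected conditional covariance plus covariance of conditional means.
within = sum(p[i] * cond_cov(i) for i in range(len(p)))
between = sum(p[i] * (cond_mean(i, 0) - mean(0)) * (cond_mean(i, 1) - mean(1))
              for i in range(len(p)))

print(mean_of_cond == mean(0), within + between == cov())
```

The `within` term is the contribution of the second stage and the `between` term that of the first stage, which is the stage-by-stage split the section describes.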





It would be easy to give applications of Theorems 1 and 2 but these are not
essential for our purposes in this paper. As remarked in the introduction, these
two theorems have long been part of what we may call the folklore of sampling.

3. Stratified sampling and negative correlation, with an application to systematic
sampling. In discussing plans for sampling from a stratified population
it is customary to suppose that if $x'$ is an estimate and $x' = x'_1 + \cdots + x'_L$
where $x'_j$ is the contribution to $x'$ arising from the $j$th of the $L$ strata, then the
sampling is to be so done that the random variables $x'_i$ and $x'_j$, $j \ne i$, are
independent.

In [1, p. 13] it was noted that if a population were stratified, and if the elements
were so selected that the contributions from different strata were negatively
correlated, it would follow that the variance of the estimate would be less than
if the contributions were independent but had the same covariances within
strata. This was, of course, an immediate conclusion from the fact that

$\sigma^2_{x'} = \sum_{i,j=1}^{L} \sigma_{x'_i x'_j}$

and, hence, if

(3.1) $C = \sum_{i \ne j} \sigma_{x'_i x'_j} < 0$

then $\sigma^2_{x'}$ is less than it would be if $C = 0$. If $C < 0$ we shall say that the sample
design has "negative correlation."

It is obvious that any population may be taken to be itself a sample, a sample
from the possible populations that might have been produced by the forces that
determined the existing population. Inasmuch as sampling designs are often
chosen on the basis of a knowledge of the dominating forces and some past
experience, it is realistic to consider not only the expected values and variances
for a specific population but also their expected values over all possible populations
determined by the same forces. Cochran [3] has given one illustration of
the usefulness of considering the expected variance of a sample design. He
considered the elements $x_1, \ldots, x_N$ of the population themselves to be random
variables and supposed that $E x_i = \mu$ and $E(x_i - \mu)^2 = \sigma^2$. For his purposes
it was also convenient to suppose that if $u > 0$ then $E(x_i - \mu)(x_{i+u} - \mu) = \rho_u \sigma^2$.
It was then possible for him to make realistic hypotheses concerning the
correlogram, i.e. the $\rho_u$ considered as a function of $u$, that would not have been
reasonable in dealing with a specific population. He thus obtained general
conclusions concerning the expected efficiency of systematic sampling designs
as compared with random and stratified random designs.

In this paper we shall consider not only the expected values and variances
for the given finite population but also the expected values of these expected
values and variances under the assumption that the elements of the population
are themselves random variables. We shall use $\mathcal{E}$ to denote the expected value
considering the elements of the population to be random variables and as before
use $E$ for expected values based on the specified finite population.

Then

$\mathcal{E}\sigma^2_{x'} = \sum_{i,j=1}^{L} \mathcal{E}\sigma_{x'_i x'_j},$

and if $\mathcal{E}C < 0$ we shall say that the design has "expected negative correlation."
We now propose to obtain the beginnings of an approach to sample design
when it is possible to introduce or take advantage of negative correlation or
expected negative correlation through the sample design.

To simplify, we shall begin by considering two strata and shall suppose that
the possible values of $x'$ are $x_1, \ldots, x_n$ while the possible values of $y'$ are
$y_1, \ldots, y_n$. Furthermore, we shall suppose the sampling to be so done that

$P\{x' = x_i\} = P\{y' = y_i\} = P\{x' = x_i,\ y' = y_i\} = p_i > 0,$

so that $\sum_{i=1}^{n} p_i = 1$ and $P\{x' = x_i,\ y' = y_j\} = 0$ if $i \ne j$.

Under the above assumptions, it follows that

(3.2) $\sigma_{x'y'} = \sum_{i=1}^{n} p_i x_i y_i - \sum_{i,j=1}^{n} p_i p_j x_i y_j.$

The symbol > 0 means that > 0 for all ^ and j and > 0 for at least 
one pair $i, j$. We shall say that if $(x_i - x_j)(y_i - y_j) \ge 0$ then the sets $(x)$ and
$(y)$, where $(x)$ stands for $x_1, \ldots, x_n$ and $(y)$ for $y_1, \ldots, y_n$, are similarly ordered
and if $(x_i - x_j)(y_i - y_j) \le 0$ then these sets are oppositely ordered. Then it
is easy to prove directly [4, p. 43] that if the values are oppositely ordered, then
$\sigma_{x'y'} \le 0$ and if they are similarly ordered then $\sigma_{x'y'} \ge 0$.
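The sign of the covariance in (3.2) can be seen on a small numerical case. This is a sketch under stated assumptions: the function name `linked_cov`, the four values in each stratum, and the equal probabilities are all invented for illustration and do not come from the paper.

```python
from fractions import Fraction as F

def linked_cov(p, x, y):
    """Covariance (3.2) when x' = x_i and y' = y_i always occur together with probability p_i."""
    t = sum(pi * xi * yi for pi, xi, yi in zip(p, x, y))               # sum of p_i x_i y_i
    b = sum(pi * pj * xi * yj for pi, xi in zip(p, x) for pj, yj in zip(p, y))
    return t - b

p = [F(1, 4)] * 4
x = [1, 2, 4, 7]
y_opp = [9, 6, 5, 1]   # decreasing while x increases: oppositely ordered
y_sim = [1, 5, 6, 9]   # increasing with x: similarly ordered

print(linked_cov(p, x, y_opp), linked_cov(p, x, y_sim))
```

With these numbers the oppositely ordered pair gives a negative covariance and the similarly ordered pair a positive one, in line with the statement above.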

A somewhat more general result is the following: 

Theorem 3. Let $n \le k$, let

$b = \sum_{i=1}^{n} \sum_{j=1}^{k} a_{ij} w_i z_j$

be a real bilinear form, and let

$t = \sum_{i=1}^{n} a_{ii} w_i$

be a real linear form, where $w_i \ge 0$, $z_j \ge 0$ and $\sum_{i=1}^{n} w_i = \sum_{j=1}^{k} z_j = 1$.

Then a sufficient condition that $b \ge t$ is

(3.3) $a_{ij} \ge a_{ii}.$

If $k = n$ and $w_i = z_i$, then $b \ge t$ if

(3.4) $a_{ij} + a_{ji} \ge a_{ii} + a_{jj}.$





Proof. Since

$b - t = \sum_{i} a_{ii}(w_i z_i - w_i) + \sum_{i \ne j} a_{ij} w_i z_j,$

and since

$1 - z_i = \sum_{j \ne i} z_j,$

it follows that

$b - t = \sum_{i \ne j} (a_{ij} - a_{ii}) w_i z_j.$

Hence, $b \ge t$ if (3.3) holds. Also, if $k = n$ and $w_i = z_i$ then $b \ge t$ if (3.4) holds.
Some obvious generalizations of Theorem 3 have been omitted since we do not
need them.
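For the case $k = n$, $w_i = z_i$, the step from (3.3) to (3.4) rests on symmetrizing the double sum. The following sketch checks that rearrangement numerically; it is an illustration rather than the author's argument, and the matrix `a`, the weights `w`, and the function names are assumptions chosen for the example.

```python
from fractions import Fraction as F

def b_minus_t(a, w):
    """b - t for the bilinear form with z = w and weights summing to one."""
    n = len(w)
    b = sum(a[i][j] * w[i] * w[j] for i in range(n) for j in range(n))
    t = sum(a[i][i] * w[i] for i in range(n))
    return b - t

def symmetrized(a, w):
    """Half the sum of (a_ij + a_ji - a_ii - a_jj) w_i w_j over all pairs i, j."""
    n = len(w)
    return F(1, 2) * sum((a[i][j] + a[j][i] - a[i][i] - a[j][j]) * w[i] * w[j]
                         for i in range(n) for j in range(n))

a = [[2, -1, 4], [0, 3, 5], [7, 1, -2]]   # arbitrary real coefficients
w = [F(1, 2), F(1, 3), F(1, 6)]            # non-negative weights with sum 1

print(b_minus_t(a, w) == symmetrized(a, w))
```

Because every weight product is non-negative, the symmetrized form makes it visible that (3.4) holding for all pairs is enough for $b \ge t$.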

To obtain the result that $\sigma_{x'y'} \le 0$ if the sets $(x)$ and $(y)$ are oppositely ordered,
we make the identifications $a_{ij} = x_i y_j$ and $z_i = w_i = p_i$. Then (3.4) holds and
substituting we have

(3.5) $a_{ii} + a_{jj} - a_{ij} - a_{ji} = (x_i - x_j)(y_i - y_j)$

so that if the values are oppositely ordered, $\sigma_{x'y'} \le 0$; hence the two strata
have negative correlation.

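To consider expected negative correlation we note that

(3.6) $\mathcal{E}\sigma_{x'y'} = \sum_{i=1}^{n} p_i \sigma_{ii} - \sum_{i,j=1}^{n} p_i p_j \sigma_{ij}$

where we suppose that $\mathcal{E}x_i = \mu$, $\mathcal{E}y_i = \nu$ and

$\mathcal{E}(x_i - \mu)(y_j - \nu) = \sigma_{ij}$

so that in this case $\sigma_{ii}$ is a covariance, not a variance.

If we put $a_{ij} = \sigma_{ij}$ and $z_i = w_i = p_i$, then (3.4) holds and we obtain, as
sufficient for $\mathcal{E}\sigma_{x'y'}$ to be negative, that

(3.7) $\sigma_{ij} + \sigma_{ji} \ge \sigma_{ii} + \sigma_{jj}$

or, if we define $\rho_{ij}$ by the equation

$\rho_{ij}\,\sigma_x \sigma_y = \sigma_{ij},$

where $\sigma_x^2 = \mathcal{E}(x_i - \mu)^2$ and $\sigma_y^2 = \mathcal{E}(y_i - \nu)^2$, we have

(3.8) $\rho_{ij} + \rho_{ji} \ge \rho_{ii} + \rho_{jj}$

as a sufficient condition for $\mathcal{E}\sigma_{x'y'} \le 0$.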

Let us consider the systematic sampling of single elements. In systematic
sampling, we assume a population of $kn$ ordered elements $x_1, x_2, \ldots, x_k,$
$x_{1+k}, \ldots, x_{2k}, \ldots, x_{1+(n-1)k}, \ldots, x_{kn}$, of which we wish to estimate the
arithmetic mean $\bar{x}$. As our estimate we use

$\bar{x}' = (x'_1 + \cdots + x'_n)/n$

where $x'_1$ is selected at random from $x_1, \ldots, x_k$ and, if $x'_1 = x_\alpha$, then $x'_i = x_{\alpha+(i-1)k}$,
$i = 2, \ldots, n$. Thus, $\bar{x}'$ may be interpreted as an estimate based on a stratified
population, the $i$th stratum consisting of

$x_{1+(i-1)k}, \ldots, x_{ik},$

and

$P\{x'_i = x_{\alpha+(i-1)k}\} = P\{x'_i = x_{\alpha+(i-1)k},\ x'_j = x_{\alpha+(j-1)k}\} = 1/k$

while

$P\{x'_i = x_{\alpha+(i-1)k},\ x'_j = x_{\beta+(j-1)k}\} = 0,$ if $\alpha \ne \beta$.

Then (3.5) applies, where

$a_{\alpha\beta} = x_{\alpha+(i-1)k}\, x_{\beta+(j-1)k}.$


Hence, any two strata that are oppositely ordered will yield a negative contribution
to the variance. However, since it is not possible for all strata to be negatively
ordered, we do not thus obtain a useful result and must return to the
consideration of $C$ or $\sigma^2_{\bar{x}'}$ itself as was done in [1]. If, however, we make Cochran's
assumptions, and consider $\mathcal{E}\sigma_{x'_i x'_j}$, it follows that for the $i$th and $j$th strata

$\rho_{\alpha\beta} = \rho_{(j-i)k+\beta-\alpha},$

and (3.8) becomes

(3.9) $\rho_{(j-i)k+(\beta-\alpha)} + \rho_{(j-i)k+(\alpha-\beta)} \ge 2\rho_{(j-i)k},$

i.e. the correlation function $\rho_u$ must be concave upwards, which Cochran showed
by other means. By considering $\mathcal{E}C$ it is possible to show that a sort of average
concavity is all that is required of the correlogram for systematic sampling to
have a smaller variance than stratified random sampling.
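The effect of the ordering of strata on a fixed finite population can be illustrated by computing both variances exactly. The sketch below is not from the paper: the two eight-element populations and the function names are assumptions, and "stratified" here means one element drawn independently and at random from each block of $k$ consecutive elements.

```python
from fractions import Fraction as F

def systematic_variance(pop, k):
    """Exact variance of the sample mean over the k possible systematic samples."""
    n = len(pop) // k
    grand = F(sum(pop), len(pop))
    means = [F(sum(pop[a + i * k] for i in range(n)), n) for a in range(k)]
    return sum((m - grand) ** 2 for m in means) / k

def stratified_variance(pop, k):
    """Exact variance of the mean of one independent random draw per stratum of k elements."""
    n = len(pop) // k
    total = F(0)
    for i in range(n):
        stratum = pop[i * k:(i + 1) * k]
        mu = F(sum(stratum), k)
        total += sum((v - mu) ** 2 for v in stratum) / k   # variance within stratum i
    return total / n ** 2

trend = [1, 2, 3, 4, 5, 6, 7, 8]    # both strata increasing: similarly ordered
zigzag = [1, 2, 3, 4, 4, 3, 2, 1]   # second stratum reversed: oppositely ordered

print(systematic_variance(trend, 4), stratified_variance(trend, 4))
print(systematic_variance(zigzag, 4), stratified_variance(zigzag, 4))
```

For the similarly ordered population the systematic design has the larger variance, and for the oppositely ordered one it has the smaller, which is the direction of the covariance argument above for a pair of strata.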


4. Conditions for negative correlation when the strata are of unequal sizes
with an application to systematic sampling. Often, as in the systematic selection
of clusters with probability proportionate to size (discussed in Section 5) the
simplified situation dealt with in Theorem 3 does not directly apply. However,
Theorem 3 may be used to advantage by the following device.

Let us suppose the possible values of $x'$ to be $x_1, \ldots, x_n$ and those of $y'_0$ to be
$y^0_1, \ldots, y^0_k$, $k > n$, and let

$P\{y'_0 = y^0_\beta \mid x' = x_\alpha\} = p_{\beta|\alpha}$

so that if we define

(4.1) $y_\alpha = \sum_{\beta=1}^{k} y^0_\beta\, p_{\beta|\alpha},$

then

$y_\alpha = E(y'_0 \mid x' = x_\alpha).$

If we define $y'$ to be a random variable having possible values $y_1, \ldots, y_n$
with probabilities $p_1, \ldots, p_n$ where

$p_\alpha = P\{x' = x_\alpha\}$

it follows that

$y' = E^*(y'_0 \mid x')$

and

$\sigma_{x'y'_0} = \sigma_{x'y'}.$

Clearly, Theorem 3 is valid for the random variables $x'$ and $y'$.

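Consequently, we need only determine what restrictions the conditional
probabilities, $p_{\beta|\alpha}$, and the values, $y^0_\beta$, need satisfy for the sets $x_1, \ldots, x_n$ and
$y_1, \ldots, y_n$ to be oppositely ordered or for (3.7) to hold.

Substituting for $y_\alpha$ and $y_\gamma$ in (3.5) we see that if

(4.2) $(x_\alpha - x_\gamma) \sum_{\beta=1}^{k} y^0_\beta\,(p_{\beta|\alpha} - p_{\beta|\gamma}) \le 0$

then

$\sigma_{x'y'_0} \le 0.$

Let

$\sigma_{\alpha\beta} = \mathcal{E}(x_\alpha - \mu)(y^0_\beta - \nu).$

Then substituting in (3.7) we see that if

(4.3) $\sum_{\beta=1}^{k} (p_{\beta|\alpha} - p_{\beta|\gamma})(\sigma_{\alpha\beta} - \sigma_{\gamma\beta}) \le 0$

or if

(4.4) $\sum_{\beta=1}^{k} (p_{\beta|\alpha} - p_{\beta|\gamma})(\rho_{\alpha\beta} - \rho_{\gamma\beta}) \le 0$

then

$\mathcal{E}\sigma_{x'y'_0} \le 0.$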

In order to use (4.2) and (4.3) the following well-known lemma is often useful.
Lemma. If $f_1 \le f_2 \le \cdots \le f_k \le 0$ and the quantities $e_1, \ldots, e_k$ are such that

$\sum_{\beta=1}^{\delta} e_\beta \ge 0$

then

$\sum_{\beta=1}^{\delta} f_\beta e_\beta \le 0, \qquad \delta = 1, \ldots, k.$

Let us use this lemma to obtain another theorem that will be helpful in showing
negative or expected negative correlation between strata.

Theorem 4. Let $b$ be a bilinear form

$b = \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij} w_i z_j$

such that $\sum_{i=1}^{\delta} w_i \ge 0$, $\sum_{j=1}^{\delta'} z_j \ge 0$, $\delta = 1, \ldots, n - 1$, $\delta' = 1, \ldots, m - 1$, and

(4.5) $\sum_{i=1}^{n} w_i = \sum_{j=1}^{m} z_j = 0.$

Let

$\delta_{ij} = a_{ij} - a_{i,j+1} - a_{i+1,j} + a_{i+1,j+1}.$

Then a sufficient condition that $b \le 0$ is $\delta_{ij} \le 0$.

Proof. Upon substituting for $w_n$ and $z_m$ in $b$ from (4.5) we see that

$b = \sum_{i=1}^{n-1} \sum_{j=1}^{m-1} s_{ij} w_i z_j$

where

$s_{ij} = a_{ij} - a_{im} - a_{nj} + a_{nm}$

or, if we define,

$\xi_j = \sum_{i=1}^{n-1} s_{ij} w_i$

then

$b = \sum_{j=1}^{m-1} \xi_j z_j.$

According to the lemma, it then follows that a sufficient condition that $b \le 0$ is
that

$\xi_1 \le \xi_2 \le \cdots \le \xi_{m-1} \le 0.$

Also, a sufficient condition that

$\xi_j - \xi_{j+1} \le 0,$

is

$s_{ij} - s_{i,j+1} \le s_{i+1,j} - s_{i+1,j+1}.$

Then to complete the proof it is only necessary to verify that

$\delta_{ij} = s_{ij} - s_{i,j+1} - s_{i+1,j} + s_{i+1,j+1}.$
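Theorem 4 can be checked numerically through a summation-by-parts rearrangement: when the $w_i$ and the $z_j$ each sum to zero, $b$ equals the sum of $\delta_{ij} W_i Z_j$, where $W_i$ and $Z_j$ are the partial sums. That rearrangement is a standard identity used here as an aid; it is not claimed to be the author's line of proof. All names and numbers below are assumptions for illustration.

```python
from itertools import accumulate

def bilinear(a, w, z):
    """b = sum over i, j of a_ij * w_i * z_j."""
    return sum(a[i][j] * w[i] * z[j] for i in range(len(w)) for j in range(len(z)))

def second_differences(a):
    """delta_ij = a_ij - a_{i,j+1} - a_{i+1,j} + a_{i+1,j+1}."""
    n, m = len(a), len(a[0])
    return [[a[i][j] - a[i][j + 1] - a[i + 1][j] + a[i + 1][j + 1]
             for j in range(m - 1)] for i in range(n - 1)]

def via_partial_sums(a, w, z):
    """Same bilinear form written with partial sums of w and z (valid when both sum to zero)."""
    d = second_differences(a)
    W, Z = list(accumulate(w)), list(accumulate(z))
    return sum(d[i][j] * W[i] * Z[j]
               for i in range(len(w) - 1) for j in range(len(z) - 1))

x = [1, 3, 4]                          # increasing values in one stratum
y = [9, 5, 2, 1]                       # decreasing values in the other
a = [[xi * yj for yj in y] for xi in x]
w = [2, -1, -1]                        # partial sums 2, 1, 0
z = [1, 1, -1, -1]                     # partial sums 1, 2, 1, 0

print(bilinear(a, w, z), via_partial_sums(a, w, z))
```

Since every $\delta_{ij}$ is non-positive for these oppositely ordered values and every partial sum is non-negative, the bilinear form comes out non-positive, as the theorem states.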

In the preceding pages we have given an identification of systematic with 
stratified sampling where, instead of the selection being made independently 
within strata, the choice of an element from one stratum determines the choice 
from the other strata. In this identification, however, it was assumed that the 
strata contained the same number of elements. Let us now extend this method 
of selecting samples to the case where the strata have different numbers of 
elements. In so doing we shall illustrate the use of the above lemma and Theorem 4.

Suppose now that the population consists of $N$ elements $x_1, \ldots, x_N$ classified
into $n$ strata, the $i$th of which contains the $N_i$ elements

$x_{N_1+\cdots+N_{i-1}+1}, \ldots, x_{N_1+\cdots+N_i}.$

We shall denote these elements by $x_{i1}, \ldots, x_{iN_i}$.

We shall select one element from each of these $n$ strata. The element selected
from the $i$th stratum is written $x'_i$. As the estimate of $\bar{x}$, the arithmetic mean of
the population, we use

$\bar{x}' = \frac{1}{N} \sum_{i=1}^{n} N_i x'_i$

and it is well known that if the selection is made independently at random from
each stratum, then

$\sigma^2_{\bar{x}'} = \frac{1}{N^2} \sum_{i=1}^{n} N_i^2 \sigma_i^2$

where $\sigma_i^2$ is the variance of $x'_i$, i.e. the variance of the $i$th stratum.

Let us now consider an alternative to the usual method. We can suppose that
$N_1 > 1$ without any loss of generality. (The methods are the same for any stratum
having $N_i = 1$ and will also yield the same result for any population such that
either all the $N_i = 1$ or all but one of the $N_i = 1$. Differences occur if at least two
of the $N_i$ differ from 1.)

We first choose an element at random from the first stratum. Suppose that
$x'_1 = x_{1\alpha}$. Then to choose an element from the second stratum, assuming that
$N_2 > 1$, we proceed as follows: Multiply $N_2$ by any positive integer $t_2$ such that
$N_2 t_2/N_1$ is an integer, say, $k_2$. Assign to each element of the second stratum the
measure of size $t_2$, and form the two sets of cumulative totals $t_2, 2t_2, \ldots, N_2 t_2$
and $k_2, 2k_2, \ldots, N_1 k_2$. Then with the measures of size $t_2$ assigned to each element
of stratum 2, and the measure of size $k_2$ assigned to each element of stratum 1,
it follows that strata 1 and 2 have the same total size.

As an example of the arithmetic given below consider the following simple case.
Suppose that $N_1 = 3$ and $N_2 = 4$. Then if we take for $t_2$ the value 6, it follows
that $k_2 = 8$. We choose one of the integers 1, 2, 3 with equal probability. If the
integer 1 is obtained, we have selected the first element of the first stratum and
choose an integer between 1 and 8 with equal probability. If the selected integer 
is between 1 and 6, the first element of the second stratum is selected. If it is 
7 or 8 the second element of the second stratum is selected. Similarly if the 
second element of the first stratum is selected, then we select an integer between 
9 and 16 with equal probability. If that integer has value 9, ..., 12 the second
element of the second stratum is selected; if it has value 13, ..., 16 the third
element is selected. 
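The arithmetic of this example can be checked by enumerating every equally likely integer. The sketch below is an illustration and not the paper's notation: the function names are assumptions, `t2` stands for the measure of size given to each element of stratum 2, and `k2 = N2 * t2 / N1` for the measure given to each element of stratum 1.

```python
from fractions import Fraction as F

def linked_selection(N1, N2, t2):
    """Joint probabilities of (element of stratum 1, element of stratum 2) under the linked scheme."""
    total = N2 * t2
    if total % N1:
        raise ValueError("N2 * t2 must be a multiple of N1")
    k2 = total // N1
    joint = {}
    for alpha in range(1, N1 + 1):                      # element chosen in stratum 1
        for u in range((alpha - 1) * k2 + 1, alpha * k2 + 1):
            g = -(-u // t2)                             # ceiling of u / t2: element of stratum 2
            joint[(alpha, g)] = joint.get((alpha, g), 0) + F(1, N1 * k2)
    return k2, joint

def marginal_stratum2(joint, N2):
    """Probability that each element of stratum 2 is selected."""
    return [sum(q for (_, g), q in joint.items() if g == e) for e in range(1, N2 + 1)]

k2, joint = linked_selection(N1=3, N2=4, t2=6)
print(k2, marginal_stratum2(joint, 4))
```

In this case each of the four elements of the second stratum has total probability one quarter, even though the element chosen in the first stratum restricts which elements of the second stratum can follow it.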

The general formulation of the selection procedure for the second stratum is:
Suppose that $\beta_0$ is the smallest integer such that $(\alpha - 1)k_2 + 1 \le \beta_0 t_2$ and that
$\beta_1$ is such that $(\beta_1 - 1)t_2 < \alpha k_2 \le \beta_1 t_2$. Choose an integer at random from
$1, \ldots, k_2$ and call that integer $\beta$. Then, if

$(\alpha - 1)k_2 < (\alpha - 1)k_2 + \beta \le \beta_0 t_2$

the $\beta_0$th element is selected from stratum 2; if

$\beta_0 t_2 < (\alpha - 1)k_2 + \beta \le (\beta_0 + 1)t_2$

the $(\beta_0 + 1)$th element is selected; ...; and if

$(\beta_1 - 1)t_2 < (\alpha - 1)k_2 + \beta \le \alpha k_2$

the $\beta_1$th element is selected from stratum 2.

It is easy to verify that when the sample is so selected, each element of stratum 
2 has equal probability of being selected. Hence, if we apply this procedure to 
each stratum we have 



Let us evaluate $\sigma_{x'_i x'_j}$ for this type of selection. Now

$\sigma_{x'_i x'_j} = E(x'_i - \bar{x}_i)(x'_j - \bar{x}_j)$

where $\bar{x}_i$ is the arithmetic mean of the elements of the $i$th stratum. From Theorem
2, we then have

$\sigma_{x'_i x'_j} = \frac{1}{N_1} \sum_{\alpha=1}^{N_1} E\{[x'_i - E(x'_i \mid x_{1\alpha})][x'_j - E(x'_j \mid x_{1\alpha})] \mid x_{1\alpha}\}$
$\qquad + \frac{1}{N_1} \sum_{\alpha=1}^{N_1} [E(x'_i \mid x_{1\alpha}) - \bar{x}_i][E(x'_j \mid x_{1\alpha}) - \bar{x}_j].$

It is easy to see that the method of selection used above implies that the first
term of $\sigma_{x'_i x'_j}$ vanishes. Furthermore, $\bar{x}_i$ is the arithmetic mean of the conditional
expectations so that we have reduced the problem to one of determining whether
the conditional expectations satisfy the conditions for negative correlation or
expected negative correlation.

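If we denote $E(x'_i \mid x_{1\alpha})$ by $y_{i\alpha}$, then we need to see whether the sets
$y_{i1}, \ldots, y_{iN_1}$ and $y_{j1}, \ldots, y_{jN_1}$ are oppositely ordered. Now

$(y_{i\alpha} - y_{i\beta})(y_{j\alpha} - y_{j\beta}) = \sum_{g=1}^{N_i} \sum_{h=1}^{N_j} e_{ig\alpha\beta}\, e_{jh\alpha\beta}\, x_{ig} x_{jh}$

where

$e_{ig\alpha\beta} = P\{x'_i = x_{ig} \mid x_{1\alpha}\} - P\{x'_i = x_{ig} \mid x_{1\beta}\}.$

If $\alpha < \beta$ then, according to the method of selection,

$\sum_{g=1}^{\delta} e_{ig\alpha\beta} \ge 0, \qquad \delta = 1, \ldots, N_i - 1,$

while

$\sum_{g=1}^{N_i} e_{ig\alpha\beta} = 0.$

In Theorem 4, we then make the identifications $n = N_i$, $m = N_j$,

$w_g = e_{ig\alpha\beta}$, $z_h = e_{jh\alpha\beta}$ and $a_{gh} = x_{ig} x_{jh}$.

Then

$\delta_{gh} = (x_{ig} - x_{i,g+1})(x_{jh} - x_{j,h+1})$

and hence to have negative correlation between the strata, it is sufficient that
the sets $x_{i1}, \ldots, x_{iN_i}$ and $x_{j1}, \ldots, x_{jN_j}$ have the type of negative ordering
represented by $\delta_{gh} \le 0$. Similarly, if

$\sigma_{gh} = \mathcal{E}(x_{ig} - \mu_i)(x_{jh} - \mu_j), \qquad \mu_i = \mathcal{E}x_{ig},$

then, for expected negative correlation, it is sufficient that

$\sigma_{gh} - \sigma_{g,h+1} - \sigma_{g+1,h} + \sigma_{g+1,h+1} \le 0.$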

Of course, these conditions will be satisfied if a concave upwards correlogram
exists. Hence, if a population consists of $N$ random variables $x_1, \ldots, x_N$ having
a concave upwards correlogram, then, no matter into what strata these elements
are classified, provided that the order of occurrence of the elements remains
unaltered, the systematic selection of the elements in the sample can be so planned
as to yield an estimate having smaller variance than the stratified random
selection of the elements in the sample even if optimum allocation is used. If more
than one element is being selected from a stratum under optimum allocation, 
then the systematic selection of the same number of elements will suffice. If not 
only optimum allocation but also optimum definitions of strata are being used 
so that but one element is selected from each stratum, then systematic selection 
according to the scheme described in the section will produce a variance not 
larger than the variance of stratified random sampling. It should be noted, how¬ 
ever, that this does not imply that a ‘hammer and tongs’ use of systematic 
sampling ignoring the strata will produce a smaller variance. There is work to 
be done on what is required for the latter to occur. 





It may be noted that the procedure of this example provides an answer to the 
systematic selection of elements from a population whose size is not a multiple 
of the size of sample. 


5. The systematic sampling of clusters with probability proportionate to a
measure of size. It is known [5] that sampling clusters with probability
proportionate to a measure of size often yields considerable reductions in the variance
of the estimates. However, the theory of the systematic selection of several
clusters with probability proportionate to a measure of size has not been worked
out, and it is the purpose of this section to make some contributions to that
theory.

The most frequently used method of sampling clusters with probability
proportionate to size is equivalent to the following. Suppose that the clusters are
denoted by $C_1, \ldots, C_M$ and that to the $h$th of these $M$ clusters is assigned a
measure of size $P_h$. Form the successive totals $P_1$, $P_1 + P_2$, $P_1 + P_2 + P_3$, ...,
$P_1 + \cdots + P_M$. If we wish to select $m$ of these clusters, we calculate $P_m =
(P_1 + \cdots + P_M)/m$. Then, assuming that $P_j \le P_m$, $j = 1, \ldots, M$, we select
an integer with equal probability from $1, \ldots, P_m$. Calling that integer $P'$, we
calculate the $m$ numbers $P'$, $P' + P_m$, $P' + 2P_m$, ..., $P' + (m - 1)P_m$.
If

(5.1) $P_1 + \cdots + P_{h-1} + 1 \le P' + (i - 1)P_m \le P_1 + \cdots + P_h$

for any integer $i$, $i = 1, \ldots, m$, then the cluster $C_h$ is selected for the sample.
Any cluster for which $P_h > P_m$ is automatically included in the sample, and if
there are, say, $a$ such clusters, then we calculate $P_{m-a}$ for the $M - a$ clusters
remaining after including these $a$ in the sample, and proceed as above.
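The selection rule (5.1) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names and the six measures of size are invented, the total size is assumed to be an exact multiple of $m$, and the case of clusters larger than the interval (handled separately in the text) is simply rejected.

```python
from bisect import bisect_left
from fractions import Fraction as F
from itertools import accumulate

def pps_systematic(sizes, m, start):
    """Return the 1-based indices of the m clusters selected for a given random start P'."""
    total = sum(sizes)
    if total % m:
        raise ValueError("this sketch assumes the total size is a multiple of m")
    interval = total // m                       # the quantity P_m of the text
    if max(sizes) > interval:
        raise ValueError("clusters larger than the interval must be handled separately")
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the interval")
    cum = list(accumulate(sizes))               # successive totals P_1, P_1 + P_2, ...
    # Cluster h is hit when the previous total is < value <= total through h, as in (5.1).
    return [bisect_left(cum, start + i * interval) + 1 for i in range(m)]

def inclusion_probabilities(sizes, m):
    """Probability of selection of each cluster, averaging over all equally likely starts."""
    interval = sum(sizes) // m
    counts = [0] * len(sizes)
    for start in range(1, interval + 1):
        for h in pps_systematic(sizes, m, start):
            counts[h - 1] += 1
    return [F(c, interval) for c in counts]

sizes = [3, 5, 2, 6, 4, 4]                      # hypothetical measures of size
print(pps_systematic(sizes, m=3, start=1))
print(inclusion_probabilities(sizes, m=3))
```

Enumerating the starts shows that each cluster is selected with probability equal to its measure of size divided by the interval, which is what "probability proportionate to size" requires.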

In deriving the variance of the estimate we shall use, we interpret that estimate
as a stratified sampling estimate. Although it is easy to obtain the expected
value of the estimate without that interpretation, we shall need it later in the
derivation of the variance, and hence we give it here to shorten the total
presentation a little.

Suppose that clusters $C_1, \ldots, C_{k_1}$ are such that

$P_1 + \cdots + P_{k_1-1} < P_m \le P_1 + \cdots + P_{k_1}.$

Then we define stratum 1 to consist of clusters $C_1, \ldots, C_{k_1}$. It is easy to see
that if the above sampling method is used then

$P\{C_h \text{ is selected from stratum } 1,\ h < k_1\} = P_h/P_m,$

$P\{C_{k_1} \text{ is selected from stratum } 1\} = (P_m - P_1 - \cdots - P_{k_1-1})/P_m.$

Furthermore, suppose that clusters $C_{k_1}, \ldots, C_{k_1+k_2}$ are such that

$P_1 + \cdots + P_{k_1+k_2-1} < 2P_m \le P_1 + \cdots + P_{k_1+k_2}.$

Then we define stratum 2 to consist of clusters $C_{k_1}, \ldots, C_{k_1+k_2}$. It is easy to
see that if the above sampling method is used, then

$P\{C_{k_1} \text{ is selected from stratum } 2\} = (P_1 + \cdots + P_{k_1} - P_m)/P_m,$

$P\{C_{k_1+h} \text{ is selected from stratum } 2,\ 1 \le h < k_2\} = P_{k_1+h}/P_m,$

$P\{C_{k_1+k_2} \text{ is selected from stratum } 2\} = (2P_m - P_1 - \cdots - P_{k_1+k_2-1})/P_m.$

Since $P_{k_1} \le P_m$ we remark that it is impossible that $C_{k_1}$ be selected from both
stratum 1 and stratum 2.

In general, if clusters $C_{k_1+\cdots+k_{i-1}}, \ldots, C_{k_1+\cdots+k_i}$ are such that

(5.2) $P_1 + \cdots + P_{k_1+\cdots+k_i-1} < iP_m \le P_1 + \cdots + P_{k_1+\cdots+k_i}$

then the $i$th stratum consists of these $k_i + 1$ clusters, and we define the
probabilities $P_{i\alpha}$, $\alpha = 0, \ldots, k_i$, by the equations


P« = P(C'*,.|.. is selected from stratum t) 

P*,+..,^.|b<_, —({ — 1)P,H 
- - -, 


(5.3) 


Pia = P{Ck is selected from stratum*, fci -1- • • • -b fc,_i <h <ki+ +ki} 



a = h - k, ■- ■ ■« — 


Pik, = P{Cii+.. + 1 , is selected from stratum i} 

_ '^Pm ~ Pi ~ • • • — Pi,+ . +fci-l 

K - 


We remark that 


(5.4) Pv-ii, -f P« = . 

X m 

Now, let the elements of the population be a;*/, A « 1, ■ • • , Jlf, j « 1, • ■ • , 
AT^, and let the arithmetic mean of the Ath cluster be denoted by . Since the 
$N_h$ are usually unknown but the measure of size, $P_h$, is known, we sample, not
with probability proportionate to the $N_h$, but with probability proportionate
to the $P_h$. We shall denote the clusters of the $i$th stratum by $C_{i0}, \ldots, C_{ik_i}$,
making the identification

(5.5) $C_{i\alpha} = C_{\alpha+k_1+\cdots+k_{i-1}}.$

Furthermore, the number of elements of the clusters are denoted by Wa , • • • , 





N,k,, and the means of the clusters by , • • • , Xtk,, where 

(5'6) Xia — Xa+ki+.,,+ki-i 

SO that ~ and N to “ -A^i— 

Furthermore, we define 

(5.7) Xta “ ^ta X^a/^*a “ ^a+fci+*'*+fc<-l 
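We define the mean of the $i$th stratum to be

(5.8) $\bar{x}_i = \sum_{\alpha=0}^{k_i} P_{i\alpha}\,\bar{x}_{i\alpha}/P_m,$

and the variance of the $i$th stratum to be

(5.9) $\sigma_i^2 = \sum_{\alpha=0}^{k_i} P_{i\alpha}\,(\bar{x}_{i\alpha} - \bar{x}_i)^2/P_m.$

Then, if the mean and variance of the population are defined to be

(5.10) $\bar{x} = \sum_{h=1}^{M} P_h \bar{x}_h/P$

and

(5.11) $\sigma^2 = \sum_{h=1}^{M} P_h (\bar{x}_h - \bar{x})^2/P,$

it is easy to verify that

(5.12) $\bar{x} = \frac{1}{m} \sum_{i=1}^{m} \bar{x}_i,$

and

(5.13) $\sigma^2 = \frac{1}{m} \sum_{i=1}^{m} \sigma_i^2 + \frac{1}{m} \sum_{i=1}^{m} (\bar{x}_i - \bar{x})^2.$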

An unbiased estimate of the total of a characteristic. We shall see that we can
obtain an estimate of $x$, where

$x = \sum_{h=1}^{M} \sum_{j=1}^{N_h} x_{hj},$

i.e. $x$ is the total of the elements of the population. Since $N$ is unknown, the
estimate of $\bar{x}$ that is used is the ratio of unbiased estimates of $x$ and $N$. It is well
known that this ratio is usually biased. Since we are not making any study of
ratio estimates here we will not derive the approximation to the variance of this
estimate. It may be remarked that it can be obtained by a simple extension of
the results here given.





Let us agree that the general form of the estimate will be as follows:

If the $j$th cluster of the population is selected we shall subsample $n_j$ elements
from it. The total of the values of the characteristic for these $n_j$ elements we
denote by $x'_j$. Furthermore, we denote by $n'_i$ the total number of elements
subsampled from the $i$th stratum, or, what is the same, from the cluster selected
from the $i$th stratum; and by $x''_i$ the total of these elements. Thus, if the $j$th
cluster is the $i$th selected, then $n'_i = n_j$ and $x''_i = x'_j$. We define our estimate $x''$
of $x$, the total of the population, to be

(5.14) $x'' = K(x''_1 + \cdots + x''_m).$

Then, if $K = P/mn$ and $n_h = nN_h/P_h$, it is easy to see that $x''$ is an unbiased
estimate of $x$.
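The unbiasedness can be illustrated in the special case where every selected cluster is completely enumerated, so that only the selection of clusters is random. Under that assumption the estimate reduces to the interval times the sum, over selected clusters, of cluster total divided by measure of size. The sketch below is illustrative only: the sizes, the cluster totals, and the function name are assumptions, and the total size is assumed to be a multiple of $m$.

```python
from bisect import bisect_left
from fractions import Fraction as F
from itertools import accumulate

def estimates_over_all_starts(sizes, totals, m):
    """Estimate of the population total for every equally likely random start."""
    total_size = sum(sizes)
    if total_size % m or max(sizes) > total_size // m:
        raise ValueError("sketch assumes total size divisible by m and no oversized cluster")
    interval = total_size // m
    cum = list(accumulate(sizes))
    out = []
    for start in range(1, interval + 1):
        chosen = [bisect_left(cum, start + i * interval) for i in range(m)]  # 0-based indices
        # Each selected cluster contributes (cluster total) / (measure of size).
        out.append(interval * sum(F(totals[h], sizes[h]) for h in chosen))
    return out

sizes = [3, 5, 2, 6, 4, 4]           # hypothetical measures of size
totals = [10, 7, 3, 20, 11, 9]       # hypothetical cluster totals of the characteristic
ests = estimates_over_all_starts(sizes, totals, m=3)
expected = sum(ests) / len(ests)
print(expected, sum(totals))
```

Averaged over the equally likely starts, the estimate equals the population total exactly, because each cluster enters with probability proportional to its measure of size and is weighted by the reciprocal of that measure.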

The variance of the estimate. We may calculate the variance of $x''$ where

(5.15) $x'' = P_m(\bar{x}'_1 + \cdots + \bar{x}'_m)$ and $\bar{x}'_i = x''_i/n$.

Now, by Theorem 2,

(5.16) $\sigma^2_{x''} = E\sigma^2_{x''|A} + \sigma^2_{E^*(x''|A)}$

where $A^*$ has been defined above. We shall not evaluate $E\sigma^2_{x''|A}$ since this
involves no new problem for subsampling methods using random or systematic
methods, or methods using probability proportionate to size.

From (5.15) it follows that 


(5.17) SV'IA) =P„(5i'-b ••• + Jl) 


or, in other words, $E^*(x'' \mid A)$ is the estimate we would have if the clusters in the
sample were completely enumerated. We shall denote the second term of (5.16)
by $\sigma_b^2$. Then,


(5.18) 


a 

CTb 


= 

^ m 


H 4: + E 


Now 

(5,19) 


_ki p 

2, « \2 3 

^ — O'i . 

a —0 ± tn 


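To calculate $\sigma_{\bar{x}'_i \bar{x}'_j}$, $i \ne j$, we shall use Theorem 1.

(5.20) $\sigma_{\bar{x}'_i \bar{x}'_j} = E(\bar{x}'_i - \bar{x}_i)(\bar{x}'_j - \bar{x}_j) = E\{(\bar{x}'_i - \bar{x}_i)\, E^*[(\bar{x}'_j - \bar{x}_j) \mid \bar{x}'_i]\}.$

To calculate $E^*[(\bar{x}'_j - \bar{x}_j) \mid \bar{x}'_i]$ we begin by noting that

(5.21) $E^*[(\bar{x}'_j - \bar{x}_j) \mid \bar{x}'_i] = E^*[(\bar{x}'_j - \bar{x}_j) \mid C_i]$

where $C^*_i$ is the random event having $k_i + 1$ possible states which are the
selections of $C_{i0}, \ldots, C_{ik_i}$ as the sample clusters of the $i$th stratum. Now if $C_{i\alpha}$ is
one of the clusters of the $i$th stratum let us calculate

(5.22) $E[(\bar{x}'_j - \bar{x}_j) \mid C_{i\alpha}].$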




We begin by determining which of the clusters of the $j$th stratum are possible
sample clusters, if we know that $C_{i\alpha}$ is selected from the $i$th stratum. Since the
sizes of strata $i$ and $j$ are both $P_m$ it follows that there exist integers $\beta_0$ and $\beta_1$
such that
■PjO + • * • + P,30-1 < P,1 + • • • + Pi,a-1 < Pjl + • • • + P,3o , 

and 

P,0 + • • • + P/3i_i < Pjl + • • ■ + P,a < Pyi + • • • + Pj3i . 

Hence, if we know that C^a has been selected from stratum i, it follows that we 
must select one of the clusters 

from stratum j and 

PIC ,3 is selected 1 C,* is selected} = Pj^/P,. , ;S = ft , /3o + 1, • • • , |3i 

= 0, otherwise, 

where 

P^o = Pjl + • • • + P,3, — Pa — • ■ * , Pj.a-i 
Pw = P,3,/3 = ft+ 1, ••• ,Pi - 1 
Pj@l = Pil + • • • + Pta — Pjl — • ' • — P,3i-1 , 

and 

fli 

P if — Pm 

( 3-^0 

Then 

(5 23) E[{x', - f 11 C,.) = - X,] 

where 

(5.27) = 

f-fO Cia 

Hence, substituting in (5.20), we see that 

= E{x[ — ■x,)(xJi, — X,) 

where 2/i» = Xjia if Cta is selected from stratum i. Then it follows that 
(5.25) (®.a — 'S',)(x,ia - '^y). 

a—0 

Obviously, the conditional expectation can be eliminated from (5.25) by using
(5.23) but no gain in simplicity or generality thus occurs.





It would be possible to obtain the variances and covariances of the $\bar{x}'_i$ by
listing all possible samples in any special case. To make this general would only
require writing the necessary notation.

Substituting in (5.18) we see that 

<ra = Is 

l.-i J 


where is given by (5.25). 

_ 

It follows that if we use the fact that £ (x i — x) =0, then we have 


3 

(Tb 


m ci u, 

a*0 •* m 


imif 


(Sfot "" x)(xy|a 


or, returning in part to the "unstratified” notation 

(5.26) «ri = — Z ^ (zfc - x)’ + — S E 4“ (»•'- - - *)• 

m h-i r m .yy a-o Jr 

By combining terms of the second part of (5.26) generalizations of the formulae
obtained in [1] are easily obtained.

Still another means of writing <rl is 

(5.27) ffi = — |(t’ — Z ^ (*<« — 

m [ %^i a-o F } 

where 


1 

0*6.«. 


-it {3,-if. 


m i-i 


which shows both sources of changes in efficiency as compared with sampling 
with probability proportionate to size, and replacing the clusters obtained. (It 
is, of course, obvious that /m is the variance of E*{x" \ A), if we assume the 
m clusters to have been selected with probability proportionate to size, each 
selected cluster being replaced before the next is selected.) 

By considering (5.26) and (5.27) it is clear that systematic sampling with
probability proportionate to size will be more efficient than sampling p.p.s.
with replacement under much the same conditions as when we sample single
elements. The details are omitted. They depend on applying the Lemma and
Theorem 4. The summary of the conditions is: If we sample systematically with
p.p.s., and if the two sets $\bar{x}_{i0}, \ldots, \bar{x}_{ik_i}$ and $\bar{x}_{j0}, \ldots, \bar{x}_{jk_j}$ are monotone, one being
monotone non-increasing and the other monotone non-decreasing, then the
covariance between the $i$th and $j$th strata will be negative, and thus gains made as
compared with independent sampling from the strata.

If we define 

(Tmfi = &(Zia — Sx,o)(Xy# — &Xjf) 





then the concavity condition for systematic sampling p.p.s. to yield a smaller 
variance than independent sampling p.p.s. from each stratum is, if a < |9, 


0 0^0 0 ^ 
Vol — <r s n a2 ~ O' 72 


’’akf ~ O'tkt S 


< 0 . 


6. The systematic sampling of clusters of equal sizes. Let us now suppose
that our population consists of clusters of elements, the clusters being of equal
size, i.e. containing the same number of elements. To be specific, let the population
consist of $M$ clusters, where $M = cm$ and each cluster contains $N$ elements,
where $N = kn$. Then, the value of the characteristic being measured for the
$\alpha$th element of the $i$th cluster may be denoted by $x_{i\alpha}$, and the total of all the
elements of the $i$th cluster may be denoted by $x_i$. The arithmetic mean of the
population is $\bar{x}$, and thus

$M\bar{x} = \sum_{i=1}^{M} \bar{x}_i$

where

$N\bar{x}_i = x_i.$

a. Complete enumeration of clusters in sample. First, suppose that we wish to
estimate $\bar{x}$ by $\bar{x}'$ where $\bar{x}'$ is the arithmetic mean of the sample obtained by
selecting a systematic sample of $m$ of the $M$ clusters, and enumerating all elements
within each cluster in the sample. Then, we may write

(6.1) $m\bar{x}' = \sum_{i=1}^{m} \bar{x}'_i,$

where $\bar{x}'_i$ is the mean of the $i$th cluster selected for the sample. From [1], it follows
then that

$\sigma^2_{\bar{x}'} = \frac{\sigma_c^2}{m} \{1 + (m - 1)\rho_c\}$

where $M\sigma_c^2 = \sum_{i=1}^{M} (\bar{x}_i - \bar{x})^2$ and $\rho_c$ is defined as $\rho_k$ in [1, p. 6], but with $\bar{x}_i$ in
place of $x_i$. Now from the theory of the random sampling of clusters it follows
that

$\sigma_c^2 = \frac{\sigma^2}{N} \{1 + (N - 1)\rho\}$

where $\sigma^2$ is the variance of the population, i.e.

$MN\sigma^2 = \sum_{i=1}^{M} \sum_{\alpha=1}^{N} (x_{i\alpha} - \bar{x})^2$

and $\rho$ is the intraclass correlation coefficient of elements within clusters, i.e.

$\sigma^2 \rho = \sigma_c^2 - \sigma_w^2/(N - 1),$

where

$MN\sigma_w^2 = \sum_{i=1}^{M} \sum_{\alpha=1}^{N} (x_{i\alpha} - \bar{x}_i)^2.$

Thus

(6.2) $\sigma^2_{\bar{x}'} = \frac{\sigma^2}{mN} \{1 + (N - 1)\rho\}\{1 + (m - 1)\rho_c\}.$

Of the three factors in (6.2), $\sigma^2/mN$ is the variance of a random sample of size
$mN$ selected with replacement; $1 + (N - 1)\rho$ is the factor arising from the use
of clusters; and $1 + (m - 1)\rho_c$ is the factor arising from the fact that the clusters
are sampled systematically.
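The cluster factor $1 + (N - 1)\rho$ can be checked on a small population of equal-sized clusters. The sketch below uses the definitions of $\sigma^2$, $\sigma_c^2$, $\sigma_w^2$ and $\rho$ as read from this section; the three clusters of three elements and the function name are assumptions made for illustration. The serial correlation $\rho_c$ depends on the definition in [1] and is not computed here.

```python
from fractions import Fraction as F

def cluster_decomposition(clusters):
    """Return (variance of population, variance of cluster means, within variance, rho)."""
    M, N = len(clusters), len(clusters[0])
    grand = F(sum(map(sum, clusters)), M * N)
    means = [F(sum(c), N) for c in clusters]
    var_total = sum((v - grand) ** 2 for c in clusters for v in c) / (M * N)
    var_between = sum((mu - grand) ** 2 for mu in means) / M
    var_within = sum((v - mu) ** 2 for c, mu in zip(clusters, means) for v in c) / (M * N)
    # Intraclass correlation: sigma^2 * rho = sigma_c^2 - sigma_w^2 / (N - 1).
    rho = (var_between - var_within / (N - 1)) / var_total
    return var_total, var_between, var_within, rho

clusters = [[1, 2, 3], [2, 4, 9], [5, 5, 8]]   # hypothetical population, M = 3, N = 3
var_total, var_between, var_within, rho = cluster_decomposition(clusters)
N = len(clusters[0])
print(var_between == var_total / N * (1 + (N - 1) * rho))
```

The check holds for any population with a nonzero variance and at least two elements per cluster, since the total variance splits into the between-cluster and within-cluster parts.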

b. Stratification and subsampling. When we consider the possibilities of stratification and subsampling, the number of possible designs increases tremendously. For example, it would be simple to calculate the variances of arithmetic means obtained by stratifying the population, selecting sampling units with probability proportionate to size, subsampling systematically, again subsampling systematically and finally subsampling at random. However, such studies may be left to be made in connection with the practical problems in which they are to be used. Rather than attempt to consider many of the possibilities that might arise in practice, we shall here give only the results of the systematic subsampling of a systematic sample. The variances of many other designs may easily be obtained by means of Theorems 1 and 2.

Suppose now that from each of a systematically selected sample of m clusters we subsample, systematically, n elements. Then, let our estimate of $\bar{x}$ be $\bar{x}'$ where, if $x'_{i\alpha}$ is the $\alpha$th selected element from the $i$th sample cluster, then

$$mn\bar{x}' = \sum_{i}\sum_{\alpha} x'_{i\alpha}.$$

From Theorem 2, it follows at once that

(6.3) cl = 4 . {1 (JV - Dp){ 1 + (m ^ Dp.) {1 + (m - Dpi), 

where $\sigma_i^2$ is the variance within the $i$th cluster and $\bar{\rho}_i$ is the average serial correlation within the $i$th cluster as defined in [1, p. 6]. It is simple to calculate the variance of $\bar{x}'$ also when the subsampling is done by considering the m clusters in the sample as one population from which a systematic sample is selected. This is the case that occurs when a sample of blocks is selected and all the households on the sample blocks are listed serially, a systematic sample then being selected from the lists. However, for our present purposes it is the analysis of (6.2) that is important and we now turn to a brief discussion of (6.2).

The most important conclusion to be drawn from (6.2) is that the systematic selection of clusters, even when systematic selection is desirable, may not compensate for the increase in variance caused by the use of clusters. Systematic selection will provide the same relative gains but these gains may not be large enough to produce the inequality

$$\{1 + (N-1)\rho\}\{1 + (m-1)\bar{\rho}_c\} < 1.$$

A problem that we have not worked through is the following: By regarding the elements of the population as random variables, we obtain conditions on the average correlations among elements of a single cluster as well as on the average correlations among elements of different clusters that enable us to state where the systematic sampling of clusters of equal sizes may be expected to yield a smaller variance than the random or stratified random sampling of clusters or of individual elements. This solution should be straightforward.

c. Systematic sampling in two dimensions. Systematic sampling in two dimensions occurs in such practical problems as the selection of a sample of blocks from a city or the selection of a sample of plots from a field.

In selecting blocks from a city, the procedure most often followed effectively reduces the problem to one dimensional form by first numbering the blocks of the city, or a part of it, in serpentine fashion beginning, say, in the upper right corner of a map of the city and numbering the blocks in the top row from right to left, continuing the numbering of the second row from left to right, and so on. Then a systematic sample of these block numbers, and hence of the blocks themselves, is selected. Clearly, this procedure should not be the most efficient if neighboring blocks are highly correlated, since, to cite an unrealistic possibility, the possible samples might turn out to be columns of blocks of the city.
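The serpentine numbering and the "columns of blocks" possibility can be illustrated with a small sketch. It assumes an idealised rectangular city of `n_rows` by `n_cols` blocks; the function names are hypothetical and the rectangular layout is an assumption made only for illustration.

```python
def serpentine_order(n_rows, n_cols):
    """Block coordinates (row, col) in serpentine order.

    Numbering starts in the upper right corner: the top row runs right to
    left, the second row left to right, and so on.
    """
    order = []
    for r in range(n_rows):
        cols = range(n_cols - 1, -1, -1) if r % 2 == 0 else range(n_cols)
        order.extend((r, c) for c in cols)
    return order


def systematic_sample(items, k, start):
    """Every k-th item of the list, beginning at index start (0 <= start < k)."""
    if not 0 <= start < k:
        raise ValueError("start must satisfy 0 <= start < k")
    return items[start::k]
```

When the sampling interval is twice the number of blocks per row, every selected block falls in the same column, which is the degenerate case mentioned above.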

A second two dimensional systematic sampling procedure might be that of selecting a systematic sample of the rows and a systematic sample of the columns, thus obtaining a grid sample. This design too is inefficient when there is a "fertility gradient" along rows or along columns.

The reason for the inefficiency of both of these procedures can be found by examining the formulae for the variances of systematic samples. If the numbering is serpentine, then it becomes illogical to expect that the correlogram is concave upwards and sharp deviations from that pattern may occur. In the grid design, which is a special case of the systematic sampling of clusters with systematic subsampling, we may examine (6.3) and note that the intra-class correlation coefficient $\rho$ may be large enough for $\sigma^2_{\bar{x}'}$ to be large even when $\bar{\rho}_c$ is negative.

Clearly, (6.3) suggests that the possible samples be so defined that $\rho$ is as small as possible. In square fields this might be attained by defining the possible samples to be plots of a Knut Vik square having the same treatment, and similar definitions of possible samples could easily be given for irregular fields. This subject is, however, left for further study.*

* One of the referees of this paper has drawn the author's attention to an article [6], the data of which, especially Table 3, are in accordance with the opinions expressed above.





REFERENCES

[1] W. G. Madow and L. H. Madow, "On the theory of systematic sampling, I," Annals of Math. Stat., Vol. 15 (1944), pp. 1-24.

[2] L. H. Madow, "Systematic sampling and its relation to other sampling designs," Am. Stat. Assn. Jour., Vol. 41 (1946), pp. 204-217.

[3] W. G. Cochran, "Relative accuracy of systematic and stratified random samples for a certain class of populations," Annals of Math. Stat., Vol. 17 (1946), pp. 164-177.

[4] G. H. Hardy, J. E. Littlewood and G. Polya, Inequalities, Cambridge University Press, London and New York, 1934, p. 43.

[5] M. H. Hansen and W. N. Hurwitz, "On the theory of sampling from finite populations," Annals of Math. Stat., Vol. 14 (1943), pp. 333-362.

[6] P. G. Homeyer and C. A. Black, "Sampling replicated field experiments on oats for yield determinations," Soil Sci. Soc. Proc., Vol. 11 (1946), pp. 341-344.



PROBLEMS IN PLANE SAMPLING 


By M. H. Quenouille

Rothamsted Experimental Station, Harpenden, England

1. Summary. After consideration of the relative accuracies of systematic and stratified random sampling in one dimension the problem of estimation of linear sampling error is discussed.

Methods of sampling an area are proposed, and expressions for the accuracies 
of these methods are derived. These expressions are compared for large samples, 
with special reference to correlation functions which appear to be theoretically 
and practically justified, and systematic sampling is found to be more accurate 
than stratified random sampling in many cases. Methods of estimating sampling 
errors are again considered, and examples given. The paper concludes with 
some remarks on the problem of trend in the population sampled. 


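2. Accuracy of systematic and stratified random samples in one dimension. W. G. Cochran [1] has given expressions for the variances of the means of samples of size n drawn from a population $x_1, x_2, \ldots, x_{nk}$ when the method of sampling is random (r), stratified random (st) and systematic (sy). He assumes the elements $x_1, x_2, \ldots, x_{nk}$ to be drawn from a population in which

$$E(x_i) = \mu, \qquad E(x_i - \mu)^2 = \sigma^2, \qquad E(x_i - \mu)(x_{i+u} - \mu) = \rho_u\sigma^2,$$

where $\rho_u \ge \rho_v \ge 0$ whenever $u < v$, and derives the expressions

$$\sigma_r^2 = \frac{\sigma^2}{n}\left(1 - \frac{1}{k}\right)\left[1 - \frac{2}{kn(kn-1)}\sum_{u=1}^{kn-1}(kn-u)\rho_u\right], \tag{1}$$

$$\sigma_{st}^2 = \frac{\sigma^2}{n}\left(1 - \frac{1}{k}\right)\left[1 - \frac{2}{k(k-1)}\sum_{u=1}^{k-1}(k-u)\rho_u\right], \tag{2}$$

$$\sigma_{sy}^2 = \frac{\sigma^2}{n}\left(1 - \frac{1}{k}\right)\left[1 - \frac{2}{kn(k-1)}\sum_{u=1}^{kn-1}(kn-u)\rho_u + \frac{2k}{n(k-1)}\sum_{u=1}^{n-1}(n-u)\rho_{ku}\right]. \tag{3}$$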
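The three variances are easy to evaluate for any correlogram. The sketch below is an illustration, not Cochran's or the author's code: it assumes the standard forms of (1), (2) and (3), takes the correlogram as a Python function `rho(u)` defined for integer lags, and uses hypothetical function names. It requires k of at least 2.

```python
def var_random(rho, n, k, sigma2=1.0):
    """Variance of the mean of a random sample of n from kn elements, as in (1)."""
    N = n * k
    s = sum((N - u) * rho(u) for u in range(1, N))
    return sigma2 / n * (1 - 1 / k) * (1 - 2 * s / (N * (N - 1)))


def var_stratified(rho, n, k, sigma2=1.0):
    """One random element from each of n strata of k consecutive elements, as in (2)."""
    s = sum((k - u) * rho(u) for u in range(1, k))
    return sigma2 / n * (1 - 1 / k) * (1 - 2 * s / (k * (k - 1)))


def var_systematic(rho, n, k, sigma2=1.0):
    """Every k-th element from a random start, as in (3)."""
    N = n * k
    s_all = sum((N - u) * rho(u) for u in range(1, N))
    s_sys = sum((n - u) * rho(k * u) for u in range(1, n))
    return sigma2 / n * (1 - 1 / k) * (
        1 - 2 * s_all / (N * (k - 1)) + 2 * k * s_sys / (n * (k - 1))
    )
```

For example, `var_systematic(lambda u: 0.5 ** u, n=3, k=4)` can be compared with the other two functions for the same correlogram; with all correlations zero the three values coincide at $(\sigma^2/n)(1 - 1/k)$.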


Using these expressions, which are linear functions of the $\rho_u$, Cochran compares the relative efficiencies of the methods of sampling for several types of correlogram. It is worth noting that (1), (2) and (3) can be derived under more general conditions than Cochran considered. If we assume that (a) each $x_i$ is a sample from a population with mean $\mu_i$ and variance $\sigma_i^2$, (b) that $\mu_i$ is distributed about mean $\mu$ with variance $\sigma^2$, (c) that $E(\mu_i - \mu)(\mu_j - \mu) = \rho_{ij}\sigma^2$, and (d) that …, it is not difficult to show that (1), (2) and (3) require the addition of a superposed variation

$$\frac{1}{n}\left(1 - \frac{1}{k}\right)\frac{1}{kn}\sum_i \sigma_i^2$$

to the right-hand side of the equations. Thus it should be remembered that Cochran's results give theoretical maxima to the relative efficiencies of the various methods of sampling, while $\rho_u$ is the mean correlation between samples u apart. This result is perhaps interesting in connection with sampling for, say, insect infestation, when at each point there will be a mean level of infestation and the sample will be distributed in a Poisson distribution about this mean. Then, since $\sigma_i^2 = \mu_i$ for a Poisson distribution, the superposed variation is

$$\frac{1}{n}\left(1 - \frac{1}{k}\right)\frac{1}{kn}\sum_i \mu_i.$$

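If we are sampling a continuous process,¹ for n large we can write down the integral equivalents of (1), (2) and (3)

$$\sigma_r^2 \sim \frac{\sigma^2}{n},$$

$$\sigma_{st}^2 \sim \frac{\sigma^2}{n}\left[1 - \frac{2}{d^2}\int_0^d (d-u)\rho_u\,\delta u\right], \tag{4}$$

$$\sigma_{sy}^2 \sim \frac{\sigma^2}{n}\left[1 - \frac{2}{d}\int_0^\infty \rho_u\,\delta u + 2\sum_{u=1}^{\infty}\rho_{du}\right], \tag{5}$$

where $\rho_u$ is the mean correlation between successive elements of the sample, u apart, and d is the mean distance between samples. We have thus

$$\sigma_{st}^2 - \sigma_{sy}^2 \sim \frac{2\sigma^2}{nd}\left[\frac{1}{d}\int_0^d u\rho_u\,\delta u + \int_d^\infty \rho_u\,\delta u - d\sum_{u=1}^{\infty}\rho_{du}\right], \tag{6}$$

which can often be used to investigate, quickly and roughly, with the aid of a graph, the difference between the efficiencies of stratified random and systematic sampling. Figure 1 shows how this is done for four types of correlogram.

For a continuous Markoff scheme, we have $\rho_u = \rho^u$ and

$$\sigma_{st}^2 \sim \frac{\sigma^2}{n}\left[1 + \frac{2}{\log\rho^d} + \frac{2}{(\log\rho^d)^2} - \frac{2\rho^d}{(\log\rho^d)^2}\right],$$

$$\sigma_{sy}^2 \sim \frac{\sigma^2}{n}\left[1 + \frac{2}{\log\rho^d} + \frac{2\rho^d}{1-\rho^d}\right],$$

which agree with Cochran's results.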
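As a rough numerical companion to the Markoff case, the bracketed factors can be evaluated directly. This is a sketch under the assumption that the limiting forms are $1 - (2/d^2)\int_0^d (d-u)\rho^u\,\delta u$ for stratified random sampling and $1 - (2/d)\int_0^\infty \rho^u\,\delta u + 2\sum \rho^{du}$ for systematic sampling; the function names and the argument `r` (standing for $\rho^d$) are illustrative.

```python
import math


def st_factor(r):
    """n * sigma_st^2 / sigma^2 for a Markoff correlogram, with r = rho**d, 0 < r < 1."""
    log_r = math.log(r)
    return 1 + 2 / log_r + 2 * (1 - r) / log_r ** 2


def sy_factor(r):
    """n * sigma_sy^2 / sigma^2 for a Markoff correlogram, with r = rho**d, 0 < r < 1."""
    log_r = math.log(r)
    return 1 + 2 / log_r + 2 * r / (1 - r)
```

For any r strictly between 0 and 1 the systematic factor is the smaller of the two, and as r approaches 1 the ratio `st_factor(r) / sy_factor(r)` approaches 2.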


3. Replication and the estimation of error. Yates [2] has pointed out the difficulties attached to the estimation of error for a systematic sample. It will, however, be worthwhile to investigate this point using the above formulae.

¹ In practice we can sample a continuous process only as if it were a discontinuous process with k large.



For random, stratified random and systematic sampling, if n is large and k is regarded as constant, then the variance of the estimate of the mean will be of the form $\sigma^2F(k)/n$, where $F(k)$ is virtually independent of n. Thus, if we have any method which provides an estimate of error for the samples it will be possible to split the series to be sampled into several equal parts (or blocks), to obtain an estimate of error of the mean of each part and to combine these to obtain a more accurate estimate of the error of the overall mean. In fact, if n is very large, we may wish to reduce our number of observations by obtaining estimates of error from a random selection of these parts. For stratified random sampling, $F(k)$ is completely independent of n, so that we may combine our estimates of error from each stratum. This leads us to the commonly used method of taking q randomly chosen elements per stratum, and combining the sets of variances of q − 1 degrees of freedom to form an estimate of error. If we make our samples exclusive, i.e. no two elements can coincide, then this variance has to be multiplied by 1 − q/k to give the estimated variance of the sample mean.
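The stratified estimate of error can be sketched as follows. This is one plausible reading rather than the author's procedure: it assumes q distinct elements are observed in every stratum of size k, pools the within-stratum variances (each on q − 1 degrees of freedom), divides by the total number of observations to obtain a variance of the mean, and then applies the factor 1 − q/k. The function name and data layout are hypothetical.

```python
from statistics import fmean, variance


def stratified_mean_and_error(strata_samples, k):
    """Estimate the mean and its variance from q >= 2 elements per stratum.

    strata_samples : list of lists, one list of q observed values per stratum
    k              : number of elements in each stratum
    """
    q = len(strata_samples[0])
    n_strata = len(strata_samples)
    mean = fmean([fmean(s) for s in strata_samples])
    # Pooled within-stratum variance; each stratum contributes q - 1 d.f.
    pooled = fmean([variance(s) for s in strata_samples])
    # Variance of the overall mean, with the correction for exclusive samples.
    var_mean = pooled / (n_strata * q) * (1 - q / k)
    return mean, var_mean
```

For instance, three strata of size 4 with samples `[[1, 3], [2, 6], [5, 5]]` give a pooled variance of 10/3 and an estimated variance of the mean of 10/36.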

We can in the same way estimate the variance of the mean of a systematic sample by using sets of q systematic samples of sufficient length with randomly-chosen starting points. This sampling will, however, be more difficult to carry out in practice, and we might consider other methods. Our systematic samples may be chosen to be invariable in each part or block into which the series is split so that our sampling procedure involves, in all, only q systematic samples, or we might follow the method advocated by Yates of choosing our q samples to be evenly spaced, so that they are subsamples of a larger systematic sample. Whereas this latter method has simplicity and its possible incorporation into a more extensive scheme to recommend it, its use has to be very carefully considered. If we consider the discrete case, we wish to estimate



but any estimate of variance based on q evenly-spaced systematic samples can contain only terms of the form $\rho_{ku/q}$, and while an estimate of variance based on q randomly-chosen systematic samples will obviously be limited, it will, in most cases, be more representative. As an example, suppose we take k = 16 and q = 4; then we can compare the relative occurrences of observing the correlations $\rho_1, \ldots, \rho_{16}$ in the estimate of variance. Six examples of this are given in table 1, the random numbers having been drawn from Fisher and Yates' tables, $\rho_u$ and $\rho_{16-u}$ being shown together, since they occur equally frequently. The table demonstrates how randomly-chosen samples, even as nearly systematic as the first two randomly-chosen samples, will avoid systematically sampling the correlogram. It is obvious that in most cases either method will be fairly good but the use of this latter will usually be the more accurate. Comparisons are made in table 2 for various types of correlogram using the samples indicated in table 1. It is, of course, possible to postulate theoretically many kinds of correlogram for which the equal-spaced sets of systematic samples will break down, but ultimately we must decide with reference to the types of correlogram experienced. We shall consider this point further, after we have dealt with two-dimensional sampling.

² Throughout this paper δ is used for the differential sign to prevent confusion with d.

TABLE 1

Frequency of occurrence of the serial correlations $\rho_1, \rho_2, \ldots, \rho_{16}$ in the estimate of variance when 4 systematic samples each with spacing 16 units are taken

[Table body missing in source.]

TABLE 2

Values of $\frac{1}{8}\sum \rho_u$ estimated by systematic samples

                              Evenly-    Systematic samples with random starting points
ρ_u                           spaced*      1      2      3      4      5      6      Mean   Expected
1 − 0.2u, (u = 1, …, 5)        0.17      0.27   0.20   0.17   0.30   0.17   0.13    0.21    0.27
1 − 0.1u, (u = 1, …, 10)       0.53      0.62   0.68   0.63   0.60   0.63   0.63    0.67    0.60
2^(−u)                         0.04      0.13   0.12    …     0.15    …      …      0.10    0.13
2^(−u/4)                       0.68      0.66   0.64    …     0.66    …      …      0.63    0.66
Kendall's Series 1            −0.14      0.03   0.00    …     0.16    …      …      0.01    0.07

* Naturally the use of this method of estimating the sampling error assumes that the correlation between the corresponding elements in each part or block into which the series is split may be neglected, i.e. in this case that the terms $\rho_{16}$ and above are negligible. In this case $\rho_{16} = 1/16$ and consequently the term … = 0.66 required in (6) differs slightly from the term $\frac{1}{8}\sum \rho_u = 0.65$ which we are attempting to estimate.
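The counting that underlies a table of this kind can be sketched in a few lines. The sketch assumes that the serial correlations entering the comparison of two systematic samples are determined by the difference between their starting points, taken modulo the spacing k, and that lags u and k − u are grouped together as in the text. The function name is hypothetical and the counting rule is an interpretation, not a transcription of the author's table.

```python
from collections import Counter
from itertools import combinations


def lag_counts(starts, k):
    """Count the lags between the starting points of systematic samples.

    starts : starting positions of the q systematic samples (0 <= start < k)
    k      : common spacing of the samples
    Lags u and k - u are grouped under min(u, k - u).
    """
    counts = Counter()
    for a, b in combinations(sorted(starts), 2):
        u = (b - a) % k
        counts[min(u, k - u)] += 1
    return dict(counts)
```

With evenly spaced starts `[0, 4, 8, 12]` and k = 16 only the lags 4 and 8 occur, which reflects the remark that such an estimate can contain only terms of the form $\rho_{ku/q}$; a random set of starts such as `[1, 2, 7, 11]` spreads the count over lags 1, 4, 5, 6 and 7.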

4. Methods of sampling in 2 dimensions. The number of ways in which we can sample a two-dimensional space³ is large, since we can employ random, stratified random or systematic sampling in either direction. Thus we will be able to consider every possible combination of these methods, e.g. random in one direction and systematic in another will be denoted by $r\,sy$. Furthermore we can consider the sets of samples in one direction to be aligned with one another, or to be independently determined. The suffix 1 will be used to denote aligned samples while suffix 0 will denote independent samples, e.g. we might sample according to the system $st_1sy_0$. Examples of several methods of sampling are given in Figure 2.

³ We shall, in general, consider our two-dimensional space to be rectangular, but it is not difficult to draw similar conclusions for an area of any shape.

Fig. 1. Graphical comparison of the efficiencies of systematic and stratified random sampling for various correlation functions. The thick line gives the function

$$f_1(u) = u\rho_u/d, \quad 0 < u < d; \qquad f_1(u) = \rho_u, \quad d \le u,$$

and the dotted line the function

$$f_2(u) = \rho_{id}, \quad (i-1)d < u \le id.$$

Thus systematic sampling is more or less efficient than stratified random sampling according to whether the area under the thick line is greater or less than the area under the dotted line. The most efficient method is indicated on each graph.














5. Accuracy of sampling in two dimensions. Suppose we consider a sample of $n_1n_2$ elements drawn from the elements $x_{ij}$ ($i = 1, 2, \ldots, n_1k_1$; $j = 1, 2, \ldots, n_2k_2$), (which form a single finite population drawn from an infinite hypothetical population), such that the mean spacing in the two directions is $k_1$ and $k_2$. These parameters will, if necessary, be indicated in brackets after the method of sampling, e.g. $r_1sy_0(n_1k_1, n_2k_2)$.

Let X denote the mean of a sample formed by the method considered, and $x'$ a member of this sample. Suppose, also, that the $x_{ij}$ are drawn from a population in which

$$E(x_{ij}) = \mu, \qquad E(x_{ij} - \mu)^2 = \sigma^2,$$

$$E(x_{ij} - \mu)(x_{i+u,j+v} - \mu) = \rho_{ijuv}\sigma^2.$$

Further we may average $\rho_{ijuv}$ over all possible values of i and j to define $\rho_{uv} = \rho_{-u,-v}$ by the relation

$$\sum_{i,j}\rho_{ijuv} = (k_1n_1 - |u|)(k_2n_2 - |v|)\rho_{uv}.$$

The purpose of these definitions is to allow us to eliminate the difficulties associated with the parameters of finite populations by considering this population as being itself a sample from an infinite population. Cochran employs a similar device.

5a. Random sampling. It is not difficult to see that

$$\sigma^2(X) = \tfrac{1}{2}E(X_1 - X_2)^2 = E(X_1 - \mu)^2 - E(X_1 - \mu)(X_2 - \mu),$$

where $X_1$ and $X_2$ are independent samples. Also

$$E(X_1 - \mu)(X_2 - \mu) = E(x'_1 - \mu)(x'_2 - \mu) = \frac{\sigma^2}{k_1k_2n_1n_2}\left[1 + \frac{1}{k_1k_2n_1n_2}\sum\sum_S (k_1n_1 - |u|)(k_2n_2 - |v|)\rho_{uv}\right],$$

where the double summation⁴ exists over the region S given by $|u| < k_1n_1$, $|v| < k_2n_2$ and excludes u = v = 0. We thus have to evaluate $E(X_1 - \mu)^2$ for the different types of random sampling.

It is easily shown that, for $r_0r_0$,

$$E(X_1 - \mu)^2 = \frac{\sigma^2}{n_1n_2}\left[1 + \frac{n_1n_2 - 1}{k_1k_2n_1n_2(k_1k_2n_1n_2 - 1)}\sum\sum (k_1n_1 - |u|)(k_2n_2 - |v|)\rho_{uv}\right],$$


2 r 

= — 1 + 

711772 L 


Til — 1 


kxkznxQcxkznxTiz ~ 1) 

+ 


for ror„, 

Z) Z) (/ill’ll ~ Dihnz - (p [)p„„ 

2(712 - 1 ) ^ 1 

hnzihnz - 1) S 


⁴ In general, unless otherwise stated, double summations will exist over the region for which the coefficients are positive, excluding u = v = 0.


a 

ni?ij 


+ 


for nro, 

[' + S (*.». - - 1.1),„ 

for nn, 


kini(.hn2 


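whence

$$\sigma^2(r_0r_0) = \frac{\sigma^2}{n_1n_2}\left(1 - \frac{1}{k_1k_2}\right)\left[1 - \frac{1}{k_1k_2n_1n_2(k_1k_2n_1n_2 - 1)}\sum\sum (k_1n_1 - |u|)(k_2n_2 - |v|)\rho_{uv}\right], \tag{7}$$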


■ (n ro) = — (l - [l - ^ 

tilth \ hh/ L {hki ~ l)kih 


kith — 1 


ki til th(ki hrntii — i) 


( 8 ) 


•Z Z - I M l)(*anj - I V |)p„„ 

tilth \ «1 ki/ L 


h ktitii + nj — 1) — 1 


( 9 ) 


‘Z Z (^1^1 ~ I « Dfeni - I V |)p„v + 


(ki ki — l)A:i ks ni th(ki kithth — 1) 
2ki(tii — 1) 


(kiki — l)Th(kith — 1) 

• £ (kitii - v)p„ + -- £ (kim - m)p„o1 . 

1,-1 (kikz — l)ni(kini — 1) u-i J 
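For purely random sampling of the plane the variance can be evaluated directly from the correlation function. The following is a minimal sketch of (7), assuming the form $\frac{\sigma^2}{n_1n_2}\left(1 - \frac{1}{k_1k_2}\right)\left[1 - \frac{1}{N(N-1)}\sum\sum (k_1n_1 - |u|)(k_2n_2 - |v|)\rho_{uv}\right]$ with $N = k_1k_2n_1n_2$; the function name is illustrative and `rho(u, v)` is any user-supplied correlation function with `rho(u, v) == rho(-u, -v)`.

```python
def var_r0r0(rho, n1, n2, k1, k2, sigma2=1.0):
    """Variance of the mean of n1*n2 points drawn at random, without
    replacement, from a k1*n1 by k2*n2 lattice with correlation rho(u, v)."""
    rows, cols = k1 * n1, k2 * n2
    N = rows * cols
    # Weighted sum over all displacements (u, v) except the origin.
    s = sum(
        (rows - abs(u)) * (cols - abs(v)) * rho(u, v)
        for u in range(-rows + 1, rows)
        for v in range(-cols + 1, cols)
        if (u, v) != (0, 0)
    )
    return sigma2 / (n1 * n2) * (1 - 1 / (k1 * k2)) * (1 - s / (N * (N - 1)))
```

The weight $(k_1n_1 - |u|)(k_2n_2 - |v|)$ is simply the number of ordered pairs of lattice points separated by the displacement (u, v), so the bracket is one minus the average correlation between two distinct points.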

5b. Stratified random sampling. We can deduce the variances for some methods of taking stratified samples if $\bar{x}'_i$, the mean of the elements sampled in the $i$th stratum, is independent of $\bar{x}'_j$, since we will then have

$$E(X - \bar{x})^2 = E(\bar{x}'_i - \bar{x}_i)^2/n,$$

where $\bar{x}$ is the mean of the finite population which is sampled. Hence

$$\sigma^2(st_0r_0) = \frac{1}{n_1}\sigma^2\bigl(r_0r_0(1, k_1; n_2k_2)\bigr) = \frac{\sigma^2}{n_1n_2}\left(1 - \frac{1}{k_1k_2}\right)\left[1 - \frac{1}{k_1k_2n_2(k_1k_2n_2 - 1)}\sum\sum (k_1 - |u|)(k_2n_2 - |v|)\rho_{uv}\right], \tag{10}$$


aiskro) = (l -^ V’ Tl - -_ 

MinjV kikij i_ hihiTiiikih — 1) 

(11) *£1] (fci - I u Dfen, - I D |)p„ 

^ 2h(rH - 1) n N 1 

(fci h - l)m{h ni - 1) h. ^ J ’ 

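$$\sigma^2(st_0st_0) = \frac{\sigma^2}{n_1n_2}\left(1 - \frac{1}{k_1k_2}\right)\left[1 - \frac{1}{k_1k_2(k_1k_2 - 1)}\sum\sum (k_1 - |u|)(k_2 - |v|)\rho_{uv}\right]. \tag{12}$$

To estimate the variance of other methods of sampling, we will make use of a general formula which we might have used to derive the expressions (8)-(12).

If $x'_i$ is any element of the sample X, then

$$n_1n_2(X - \bar{x})^2 = \sum_i (x'_i - \bar{x})^2 - \sum_i (x'_i - X)^2,$$

whence

$$\sigma^2(X) = E(X - \bar{x})^2$$

$$= \frac{1}{n_1n_2}E\left[\sum_i (x'_i - \bar{x})^2 - \frac{n_1n_2 - 1}{n_1n_2}\sum_i (x'_i - \mu)^2 + \frac{1}{n_1n_2}\sum_{i \ne j}(x'_i - \mu)(x'_j - \mu)\right]$$

$$= \sigma^2\left[\frac{k_1k_2n_1n_2 - 1}{k_1k_2n_1n_2}\left\{1 - \frac{1}{k_1k_2n_1n_2(k_1k_2n_1n_2 - 1)}\sum\sum (k_1n_1 - |u|)(k_2n_2 - |v|)\rho_{uv}\right\} - \frac{n_1n_2 - 1}{n_1n_2} + \frac{n_1n_2 - 1}{n_1n_2}\,\frac{E(x'_i - \mu)(x'_j - \mu)}{\sigma^2}\right]$$

$$= \frac{\sigma^2}{n_1n_2}\left(1 - \frac{1}{k_1k_2}\right)\left[1 - \frac{1}{k_1k_2n_1n_2(k_1k_2 - 1)}\sum\sum (k_1n_1 - |u|)(k_2n_2 - |v|)\rho_{uv} + \frac{k_1k_2(n_1n_2 - 1)}{k_1k_2 - 1}\,\frac{E(x'_i - \mu)(x'_j - \mu)}{\sigma^2}\right]. \tag{13}$$

Thus, provided that we can estimate $E(x'_i - \mu)(x'_j - \mu)/\sigma^2$, the expression (13) gives the error for all methods of sampling.

As an example, we might deduce the expression (12). If we choose any member $x'_i$, then a second member $x'_j$ will be located at random with respect to $x'_i$ except that there will be $k_1k_2 - 1$ positions in the same stratum as $x'_i$ that $x'_j$ will not be able to occupy. Thus the expected correlation $E(x'_i - \mu)(x'_j - \mu)/\sigma^2$ will be given by

$$\frac{1}{k_1^2k_2^2n_1n_2(n_1n_2 - 1)}\sum\sum (k_1n_1 - |u|)(k_2n_2 - |v|)\rho_{uv} - \frac{1}{k_1^2k_2^2(n_1n_2 - 1)}\sum\sum (k_1 - |u|)(k_2 - |v|)\rho_{uv}. \tag{14}$$

If we substitute (14) into (13), we will obtain expression (12) for the variance of $st_0st_0$. In the same manner, we can derive for $st_1st_1$ the expression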

E(xi - - /i) 


kihinirii — 1 ) ycikinitit 


11 


2 £ (.hni — \u\){hni — \v |)p„ 


_^ 

ki ki rii 




(Li 7h~ I U I) {h ~ I V |)p„„ 


(15) 


ki ki 'til 


m2 (h - [u Dihrii — 


w l)pui) "h i.ki I w |) 


2(kiktni - 1) to 

h niOci rii — 1) ^ 

_10 n _ \ I 2(fci hjlli — 1) 

Uk, - 1) + hrhihn, - 1) ir 


(fe — I y |)puv + 


2(^1 ki 


On 

{kiTii — y)paD 


2{kiki — 1) ^ 

h{ki ~ iT w 


(A'j.~ y)p„„ 


Thus we can evaluate $\sigma^2(X)$ for all types of stratified random sampling.
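The substitution of (14) into (13) that yields (12) takes only a few lines when written out. The shorthand $N = k_1k_2n_1n_2$, $S = \sum\sum (k_1n_1 - |u|)(k_2n_2 - |v|)\rho_{uv}$ and $S_k = \sum\sum (k_1 - |u|)(k_2 - |v|)\rho_{uv}$ is introduced here for brevity and is not the paper's notation. In this shorthand (14) reads

$$\frac{E(x'_i - \mu)(x'_j - \mu)}{\sigma^2} = \frac{S}{k_1k_2N(n_1n_2 - 1)} - \frac{S_k}{k_1^2k_2^2(n_1n_2 - 1)},$$

and the last term in the bracket of (13) becomes

$$\frac{k_1k_2(n_1n_2 - 1)}{k_1k_2 - 1}\left[\frac{S}{k_1k_2N(n_1n_2 - 1)} - \frac{S_k}{k_1^2k_2^2(n_1n_2 - 1)}\right] = \frac{S}{N(k_1k_2 - 1)} - \frac{S_k}{k_1k_2(k_1k_2 - 1)}.$$

The first part cancels the term $-S/\{N(k_1k_2 - 1)\}$ already present in the bracket of (13), leaving

$$\sigma^2(st_0st_0) = \frac{\sigma^2}{n_1n_2}\left(1 - \frac{1}{k_1k_2}\right)\left[1 - \frac{S_k}{k_1k_2(k_1k_2 - 1)}\right],$$

which is the form stated for (12).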
5c. Systematic sampling. In a similar manner to that used for stratified random sampling, we can use (13) to evaluate the variances of systematic sampling. Values of $E(x'_i - \mu)(x'_j - \mu)$ for three of the possible methods of sampling are given below. For $sy_1sy_1$

$$E(x'_i - \mu)(x'_j - \mu) = \frac{\sigma^2}{n_1n_2(n_1n_2 - 1)}\sum\sum (n_1 - |u|)(n_2 - |v|)\rho_{k_1u,k_2v}. \tag{16}$$

For syiTa 
(17) 


Eixi p)(x; p) - klmminnii - 1) ^ ^ “ I'' I) 


Aa na 


• ih Ui I y Dpxxu.. E (A* ni ~ v)p. 


For si/osj/o 


E{x[ p)(x; p) - __ E E (/<•! - I«I) 

{hni — I y \)pm — 20 E (A'l Pi I « |)(/c2 — I y Dpuh 


1 

/Ci Th 


E E (h 1 u |)(A'2 722 — I y l)pu„ + - - E E (Ah ""1^1) 

A’] lii 

■ {h - |y Dpuv + r~ E E (>h ~ I u |)(/C 2 - I V |)pfti«,„ 

/i2 


( 18 ) 





~ £ (^2 l')pOll + j—^ £ £ (fcl 1 w |)(^ — I p |)p«il:ju 

^2 •'■“1 'Cl Wj 

- S £ ». -»),..]. 

/Cl u-l J 

The derivation of (18) may be compared with that of (16).

6. Effect of alignment. We can examine the effect of alignment either by an examination of the values of the variance of different samples, or by the direct use of (13). For random and stratified random sampling, the effect of alignment is to increase the variance of the sample by an amount

$$\sum\sum a_{uv}(\rho_{0v} - \rho_{uv}) + \sum\sum b_{uv}(\rho_{u0} - \rho_{uv}), \quad \text{where } a_{uv} \ge 0,\ b_{uv} \ge 0.$$

This will be positive for monotonic decreasing correlation functions, and for the majority of functions realised in practice. Thus alignment will usually increase the variance for random and stratified random samples.

For systematic samples, the position is more complicated, but, roughly, the variance is increased by an amount

$$\sum\sum a_{uv}(\rho_{k_1u,k_2v} - \bar{\rho}_{k_1u,k_2v}),$$

where $a_{uv} > 0$ and $\bar{\rho}_{k_1u,k_2v}$ is a mean over a rectangle, centre $\rho_{k_1u,k_2v}$, for u and v non-zero, and is a mean over a line, length $k_1$, centre $\rho_{0,k_2v}$, for u zero (and similarly for v zero). Whether this is positive or negative will depend on the correlation function, and it will have to be investigated for the types of correlation function which are encountered.

7. Limiting forms. For a continuous process, when $n_1$ and $n_2$ are large, we may, in the same manner as for linear sampling, obtain integral approximations to the sampling variance, provided that $\sum\sum \rho_{d_1u,d_2v}$ converges.

We thus have


(19) 

<r°(i‘oro) = (^^(sJoro) <rVRiR2, 


(20) 

+ II 


(21) 

r 2 r* 2 r* 

- 1 "h j / PovSv -{• T 1 PudSU 

7ll 712 L ^2 ^0 •'O 

1' 

(22) 

~ - is L L[ " 1 “ i>'”" *“+ 

li 

(23) 

(T* r 1 

cr^(sfo SU) ~ ~~ 1 — 12^2 I 1 (dl ~ 1 I)(d 2 “ 1 

TZj. 7 I 2 L J—df J—di 

D Dpud Su 





(24) 


- ^ C, L ' i>'" *“ *'+M L L 

2 f* 2 

■ (rfa — I v Dpud fiw 5a 4" ^ j Puo ~ ^ Jj ~ m)puo 5u 

ff’ r If" r“ 

ff“(syiro) - 1 — / / p««5m5v 

" ni nj |_ fll <22 J-w J-»o 

+1 /«~ ^ i ’ 

(26) <r=(s2/iS2/a) ~ ~ P<,.«.a„ - f ^ Pu.5Mi«] , 

Amm) ~ [i - £ I,, “ 1 “ 

-5^»£7j^-ih)p. 


(25) 


,5ti5y 


(27) 


f f (^i ~ l't*l)(^ “ I a Dpwi) 5125^ 

Ctj (*2 J—dj J—di 
I «> fit 1 fit 

4" "5 £ / (<^ — I a |)p(f]U,ii 5a ^ / (^ 1 |)poi)5a 

(Zj 14*—M •Ldj 

1 « I rdi "I 

+ ^ I (<^1 ~ I'M |)ptt,<l,T5u ~ '3 f (di — 1m|)puo5m I. 
dj ao J—dj, ^1 J 

8. Particular case where $\rho_{uv} = \rho_u\rho_v$. We note that, if $\rho_{uv} = \rho_u\rho_v$,⁵ most of these forms can be simplified greatly. If we write

2 f* * 

1 “ J / + 2 23 Prflui 

Oi Jo u-1 


SJ/u 


2 

sl« = 1 — / (di — u)p«5'W, 

Ml Jo 

with similar forms for a^/v and ai,, and, also 

2 /•“ , 2 /^> 

/i “ ^ J Pi)5a, /i “ ^3 J (do a)pi,5a, 

2 f" , 2 

/o = ^ jf^ Pu5it, ^ ^ I “ w)pu5m. 


/i — 2 ^ pd,v, 

•—1 

CO 

f;' = 21: 


11-1 


I Pjjtt ; 


⁵ A sufficient condition for this to be a valid autocorrelation function is that both $\rho_u$ and $\rho_v$ should be autocorrelation functions.


then we have, for 

example. 


(28) 

(r*(ri ro) 

2 

O’ 

- 

Til 712 

(29) 

ir*(nn) 

2 

■ - ■■ 

Til 712 

(30) 

(r*(sfosfo) 

2 

<r 

■ 

TliTla 

(81) 

(r*(8fi sh) 

1 

Til 712 

(32) 

ff{syiSyi) 

2 

C 

ntJ - 

Til 712 

(33) 

tr\sya sya) 

2 

<r 

■ 

Til 712 

From these we get 


<r*(sfisfi) — 

(34) 

ff*(aj/i8J/i) 

2 

<r 

/>j - 11 1 

nyTh 



[(sfu 

(36) 


2 

riiih 


(1 + /l + ft), 


(sLsU +f'isyu +fisy,). 


[(8fu8«, - syusy^) + fiistu - syu) + fi{sU - sy„)], 

1 

(T 
rs,f 

riiih 

[|(1 - fiy„)(l - sy„) - (1 - sQ(l - sQ] +/'/8y„ + ftsy.,], 

2 

(36) ff\sUsta) - Am^yo) ~ — [fiistn - syf) + f^(.sU - syv)l 

niTh 


The forms (34), (35) and (36) enable us to compare the variances of the samples in two dimensions by using the one-dimensional results. For most practical cases, we know that the f's are positive, $st_u > sy_u$ and $st_v > sy_v$, so that

$$\sigma^2(st_1st_1) > \sigma^2(sy_1sy_1) > \sigma^2(st_0st_0) > \sigma^2(sy_0sy_0). \tag{37}$$

The values of $\sigma^2(st_0st_0)/\sigma^2(r_0r_0)$, $\sigma^2(sy_1sy_1)/\sigma^2(r_0r_0)$, $\sigma^2(sy_0sy_0)/\sigma^2(r_0r_0)$ and $\sigma^2(st_0st_0)/\sigma^2(sy_0sy_0)$ for $\rho_{d_1u} = \rho_1^{|u|}$ and $\rho_{d_2v} = \rho_2^{|v|}$ are given in table 3. It is not difficult to show that for a given number of samples ($d_1$, $d_2$ fixed), $\sigma^2(st_0st_0)$, $\sigma^2(sy_1sy_1)$ and $\sigma^2(sy_0sy_0)$ are least when $\rho_1 = \rho_2$. The expressions tabulated have a value of 1 for $\rho_1 = \rho_2 = 0$ and tend to limiting values of 0, 2/3, 0, and 2 respectively as $\rho_1$ and $\rho_2$ tend to 1. It is interesting to note that for $\rho_1$ and $\rho_2$ differing by more than 0.4 the grid imposed by $sy_1sy_1$ is less efficient than purely random sampling. The type of function $\rho_{uv} = \rho_u\rho_v$ is, however, less likely to be realised

⁶ For a town survey, we might find the correlation between two points depending on a within-streets and a between-streets correlation, so that this function could be realised.

TABLE 3

Comparison of the efficiencies of systematic and random sampling for various values of $\rho_1$ and $\rho_2$


PI 

pi 

0 

0.1 

02 

0.3 

04 

0.5 

0 fi 

0.7 

oa 

09 

10 


MR 

iW 

Qiffl 

ndunij 


1^^ 





1.000 


Hfififij 

1.222 

Hi^i 

1.867 

2.333 



6.667 

liHiTtli 

£iE!S!! 

00 


iKi7i:i 

1.000 

nfii!!! 

■Hjlilll 



WlTtii 


1^^ 

niSTS!i 

1.000 


H 









1.000 

1 000 



0.720 

0.669 

0.632 


0 676 


0.629 



0.471 

0 1 


0 739 

0 754 

0.827 

0.956 

1.160 

1.488 

2.055 

3 215 ' 


00 



0.596 

0.634 

dkPI 

ma 

0 437 

ligitii 

0.398 

niKBW 

lilEiwI 

0.364 



1.21 

1.25 

1.28 


1.31 

1.32 

1.33 

1.33 

1.33 

1.33 





0.665 


0.497 





0 376 

0.2 



m 

nESSj 

0.721 

0.788 


1.134 

1.632 

2 362 

4.911 

00 




BiffiSI 

0.416 

[1KI:91 

HBiw 

0.328 

ItKliM 


0.272 

0.267 




1.32 

1.36 

1.39 

1 41 

1 43 

1.44 

1.46 

1.40 

1.46 





m 

0.476 

0.441 


EH 

0,364 

0.328 

0.306 

0,3 




EH 

|j»^ 

0.778 

0 924 

Vm 

1.826 

3.761 

00 





EH 


0.297 

0.271 




0.198 





1 41 

1.46 

1.49 

1.61 

1.63 

1 64 

1.66 

1.66 






0.432 

m 




0.272 

0.247 

0 4 





0.680 

EB 

0.787 

I3ii^ 

1.437 


CO 






0.288 

EH 

0.229 


0.186 

InitM 

0.161 






1.60 

1.64 

1.67 

1.60 


1.63 

1.64 







0 364 


0.284 

0.263 

0,223 

0.106 

0.6 






0.076 

0.703 

0.821 


2 228 

eo 







0.223 

■imtif 

0.171 

ESE: 


0.116 







1.69 

1.63 

1.66 

1.68 

1 70 

1.71 









0.243 

0.210 


0,161 









0.712 

0.908 

1.679 

CO 








HiflS 

0.142 

0.121 

EXcS 









1.67 

1.71 

1.74 

1.76 

1 78 









0.206 

0.172 

0.139 

0.109 

0.7 








EM 

0.742 

1.226 

00 









0 118 

0.096 

0.077 

0.060 









1.76 

1.79 

1.82 

1.84 










'HP 


0.070 










0.667 

0.803 

DO 












0,037 










1.84 

1 87 

1 89 

0.9 










0.067 
0 667 












0.035 







1 





1.92 

1.96 










































































in practice than a centrally-symmetric function, which is independent of the choice of axes. For this reason, we consider next this latter type of function.


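9. Centrally-symmetric correlation functions. Dedebant and Wehrlé [3] have given a necessary and sufficient condition for $\rho(u, v)$ to be a correlation function as

$$\rho(u, v) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \cos(\omega u - \mu v)\,\delta F(\omega, \mu), \tag{38}$$

or alternatively,

$$f(\omega, \mu) = \frac{1}{4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \cos(\omega u - \mu v)\rho(u, v)\,\delta u\,\delta v. \tag{39}$$

For a centrally-symmetric correlation function we can put $u = r\cos\theta$, $v = r\sin\theta$; then $\rho(u, v) = \rho(r)$ and

$$f(\omega, \mu) = \frac{1}{4\pi^2}\int_0^\infty\int_0^{2\pi} \cos\left(r\sqrt{\omega^2 + \mu^2}\cos\theta_1\right)\rho(r)\,r\,\delta\theta_1\,\delta r, \quad \text{where } \theta_1 = \theta + \tan^{-1}(\mu/\omega),$$

$$= \frac{1}{2\pi}\int_0^\infty J_0(\tau r)\rho(r)\,r\,\delta r, \quad \text{where } \tau = \sqrt{\omega^2 + \mu^2}.$$

Thus, if $\rho(u, v)$ is centrally-symmetric, then so is $f(\omega, \mu)$ and conversely, so that we get

$$f(\tau) = \frac{1}{2\pi}\int_0^\infty J_0(r\tau)\rho(r)\,r\,\delta r, \tag{40}$$

and

$$\rho(r) = 2\pi\int_0^\infty J_0(r\tau)f(\tau)\,\tau\,\delta\tau. \tag{41}$$

We can thus find suitable forms for $\rho(r)$ and $f(\tau)$. In this connection the formula $\int_0^\infty J_0(yz)e^{-ay}\,\delta y = 1/(a^2 + z^2)^{1/2}$, $a > 0$, is useful, since we can see that $\frac{\delta^n}{\delta a^n}\left(e^{-a\tau}/\tau\right)$ and $\frac{\delta^n}{\delta a^n}(a^2 + r^2)^{-1/2}$ are possible functions for $2\pi f(\tau)$ and $\rho(r)$, although our choice must be limited by the stochastic nature of $\rho(r)$ as well as by its convergence. Thus, for example, a = n = 0 gives $1/2\pi\tau$ and $1/r$ as spectral and correlation functions, but these will not converge.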
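The integral formula quoted above can be checked numerically without special libraries. The sketch below is only an illustration: it evaluates $J_0$ from the integral representation $J_0(x) = \frac{1}{\pi}\int_0^\pi \cos(x\sin t)\,dt$ by the midpoint rule and then integrates $J_0(yz)e^{-ay}$ by the trapezoidal rule over a truncated range. The function names, the truncation point and the step counts are assumptions chosen for moderate arguments, not values from the paper.

```python
import math


def bessel_j0(x, nodes=120):
    """J0(x) from (1/pi) * integral_0^pi cos(x sin t) dt, midpoint rule.

    Adequate for |x| well below 2 * nodes.
    """
    h = math.pi / nodes
    return sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(nodes)) / nodes


def laplace_of_j0(a, z, upper=30.0, steps=2000):
    """Trapezoidal estimate of integral_0^upper J0(y z) exp(-a y) dy, a > 0."""
    h = upper / steps
    total = 0.5 * (1.0 + bessel_j0(upper * z) * math.exp(-a * upper))
    for i in range(1, steps):
        y = i * h
        total += bessel_j0(y * z) * math.exp(-a * y)
    return total * h
```

For a = 1 and z = 1 the estimate should lie close to $1/\sqrt{2}$, the value given by $1/(a^2 + z^2)^{1/2}$.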

In the linear case, the Markoff process $\rho(u) = e^{-au}$ had a spectral function $f(\tau) = 1/\pi(a^2 + \tau^2)$ which is a Cauchy distribution in one dimension. If we take a two-dimensional Cauchy distribution⁷ as our spectral function we get $f(\tau) = a/2\pi(a^2 + \tau^2)^{3/2}$ and $\rho(r) = -\frac{\delta}{\delta a}\left(e^{-ar}/r\right) = e^{-ar}$. Thus it appears that a generalised Cauchy distribution will be the spectral function for a generalised Markoff process.

⁷ In the same way as the ordinary Cauchy distribution can be considered as a density distribution on a line produced by a point source at a distance a, radiating in all directions, so can a two-dimensional distribution be considered as a density distribution on a plane from a source at distance a.

We can, of course, consider an "elliptical" Markoff process given by⁸

f \ 2muv 

(42) = exp 

but, in what follows, to simplify the computation, m will be taken as zero, so that by changing the units in which $d_1$ and $d_2$ are measured, we will work with a process $\rho(r) = e^{-r}$.


TABLE 4

Comparison of observed serial correlations with theoretical values obtained from a centrally-symmetric correlation function

Distance          Rows                  Columns               North-east            South-east
in miles    Observed  Calculated   Observed  Calculated   Observed  Calculated   Observed  Calculated
1             0.332     0.368        0.310     0.368         -         -            -         -
√2              -         -            -         -          0.264     0.243        0.264     0.243
2             0.149     0.135        0.090     0.135         -         -            -         -
2√2             -         -            -         -          0.060     0.069        0.129     0.069
3             0.009     0.050       -0.029     0.050         -         -            -         -
3√2             -         -            -         -         -0.060     0.018        0.070     0.018
4             0.034     0.018       -0.041     0.018         -         -            -         -
4√2             -         -            -         -         -0.020     0.004        0.060     0.004

This process does not seem to be far removed from the type of correlation 
function experienced in agricultural field work.’ Osborne [4] has mentioned 
the possible use of pu =■ Mahalanobis [5] has calculated correlations for a 
paddy field of 800 cells; his values are shown in table 4, together with values of
the function $e^{-r}$. Bearing in mind that the standard error of each of Mahalanobis’
values is approximately 0.035, the fit is seen to be quite good, although an 
elliptical process with axes running south-east and north-east would undoubtedly 
fit the observations better. 
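
The comparison in table 4 can be repeated numerically. The following Python sketch, given only as an illustration, evaluates $e^{-r}$ at the tabulated distances and expresses each deviation of an observed value in units of the approximate standard error 0.035. The observed values are those printed in table 4; the names `circular_markoff` and `standardised_deviations` are hypothetical.

```python
import math

STANDARD_ERROR = 0.035  # approximate standard error quoted for each observed value


def circular_markoff(r):
    """Correlation at distance r (miles) for the process rho(r) = exp(-r)."""
    return math.exp(-r)


# (distance, observed values) as printed in table 4; the diagonal
# directions lie at multiples of sqrt(2).
ROOT2 = math.sqrt(2)
table4 = [
    (1, (0.332, 0.310)), (ROOT2, (0.264, 0.264)),
    (2, (0.149, 0.090)), (2 * ROOT2, (0.060, 0.129)),
    (3, (0.009, -0.029)), (3 * ROOT2, (-0.060, 0.070)),
    (4, (0.034, -0.041)), (4 * ROOT2, (-0.020, 0.060)),
]


def standardised_deviations(table):
    """Return (distance, fitted value, deviations in standard-error units)."""
    result = []
    for r, observed in table:
        fitted = circular_markoff(r)
        result.append((r, fitted, [(o - fitted) / STANDARD_ERROR for o in observed]))
    return result


for r, fitted, devs in standardised_deviations(table4):
    print(f"{r:5.3f}  {fitted:6.3f}  " + "  ".join(f"{d:+5.2f}" for d in devs))
```

Deviations of one or two standard errors in the diagonal directions are the kind of discrepancy that an elliptical process might be expected to reduce.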


* … will be called the circular Markoff process, while $\rho_{uv} = \rho_1^{|u|}\rho_2^{|v|}$ …
will be known as degenerate Markoff processes of the first and second orders.

* This is further supported by the fact that using a function of this kind it is possible to 
obtain numerically a law in substantial agreement with Fairfield-Smith’s law over a wide 
range of values. 






































10. The relative efficiencies of systematic and stratified random sampling.
Ideally the correlation functions developed in the last section should be used
in the expressions (19)-(27), but these functions are not capable of easy integration.
An alternative approach can be made if we note that


(T^(stosiij) — <r' 


(43) 


j r(i 
02 \ 02 / 


+ 


where 


F(u, ds) — j" I I y Pnv^ "b f PuvSv dj pu.djv 

da L •’0 da Jd, v-i J 

di) " ^ I j ^ puv^U "b f puv^U ■ dl pd^UtV I ■ 
di LJo dl Jd, «-i J 


It is seen that $F(u, d_2)$ and $F(v, d_1)$ are extensions of the expressions obtained for
the corresponding difference of variances in section 2. Hence, if $F(u, d_2)$ and $F(v, d_1)$
are both positive functions, systematic sampling is more accurate than stratified random
sampling. A particular case of this occurs when $\rho_{uv} = \rho_1^u\rho_2^v$. However, when
$\rho_{uv} = \exp\{-(u^2 + v^2)^{1/2}\}$, $F(u, d_2)$ is not always positive, since, as u increases,
$\rho_{uv}$ becomes a convex function of v. This complicates the interpretation of (43) greatly,
since it appears that as u varies from 0 to $d_1$, $F(u, d_2)$ varies from $+\infty$ to an unknown
value X. This value will be positive if $d_2 \gg d_1$ and negative if $d_1 \gg d_2$, so
that if the sampling is disproportionate in the two directions systematic sampling
will be more efficient than stratified random sampling. Furthermore, if $d_1 = d_2 = d$
and $d \to 0$, $F(u, d) \to \infty$ and systematic sampling again appears to be more
efficient. Thus in a wide variety of cases this type of systematic sampling, i.e.
$sy_0sy_0$, gives a more accurate result than random sampling.


11. Estimation of sampling errors. An examination of formulas (7)-(18) 
shows that the principles used for the estimation of linear errors can be used in 
plane sampling. If we consider that each sample can be broken up into inde¬ 
pendent units each of which is situated in one of s strata, then for g replications 
we will have gr — s degrees of freedom for error. For example, roro, fori, sIqU 
and atari will have qnini — 1, g7i2 — 1, gnirii — m and grtz — 1 degrees of freedom 
respectively, so that a single sample will contain an unbiased estimate of error, 
but atosto , stoati , ahati, ayasyo and ayiayi will have nin 2 (g — 1), nziq — 1), g — 1 
and g — 1 degrees of freedom and will require replication to form a valid estimate 
of error. We can however use the method of splitting our sample into several 
parts each of which will give a fairly accurate estimate of error. We may, again, 
consider the possibility of using a set of systematic samples, which are evenly 
spaced, to estimate the sampling error, and we will see that the exclusion of the 
p's of lower order may lead to appreciable bias unless the correlation between 





successive terms of the sample is small, but, as Yates has pointed out, this 
method will provide an upper limit for our sampling error. These methods of 
sampling are illustrated by the examples given below. 

12. Examples. We shall consider the three methods of estimating the sampling
errors of a systematic sample:

(1) using sets of systematic samples randomly placed with respect to each
other, i.e. the material to be sampled is broken up into a series of sub-areas
or blocks and several systematic samples are taken in each block; the
error variance is calculated from the variances of the systematic samples
in each block,

(2) using one set of systematic samples randomly placed, i.e. several systematic
samples are taken and the area is then broken up into sub-areas
or blocks; the error variance is calculated from the variances of the
portions of the systematic samples in each block,

(3) using one systematic sample, i.e. one systematic sample is taken which is
broken into several systematic samples of wider spacing, e.g. four samples
at four times the original spacing; the area is then divided into several
sub-areas and the error variance is calculated from the variances of the
portions of the sub-systematic samples in each block.

These three methods are increasingly accurate in their estimation of the 
mean, increasingly biased in their estimation of the sampling variance, and 
decreasingly difficult in their practical application, so that our method of sam¬ 
pling may vary according to the population and according to the use to which the 
results are to be put. It is, for example, conceivable that subsequent sampling 
will yield an improved estimate of error so that initially only a rough guide 
may be required. 

a. If we are sampling from a continuous linear population with a large number
of observations in each part into which we split our series, methods (1) and (2)
will both give accurate estimates of the variance per term

^ P«5u + 2 ^ 

Method (3) will, however, estimate instead of the correct variance per term, 
which is 

Thus the estimates of sampling variance by method (3) will in general be higher
than the estimates by methods (1) and (2), although the actual variance will be
lower.

b. Kendall [6, 7] has constructed 480 terms of an artificial series
$u_{n+2} = 1.1u_{n+1} - 0.5u_n + \epsilon_{n+2}$, where the $\epsilon_n$ are rectangularly distributed from −49
to 49. For this series = 2379,81 and = 2536.11. The series was split in six 
parts of 80 terms, for each of which n = 5, k = 16, q = 4, so that 18 degrees of
freedom were available for error. The results for this sampling configuration are





given in table 5. The values in this table corroborate the conclusions for large 
samples of continuous populations. 
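
Systematic sampling of a series of this kind can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions and not a reproduction of Kendall's series: it generates an artificial series from the same recurrence with integer disturbances drawn uniformly from −49 to 49, takes one part of 80 terms, and forms the k = 16 possible systematic samples of n = 5 terms. The function names, the seed and the burn-in length are hypothetical.

```python
import random
import statistics


def kendall_type_series(length, seed=0, burn_in=50):
    """Series u[n+2] = 1.1*u[n+1] - 0.5*u[n] + e[n+2], e uniform on -49..49."""
    rng = random.Random(seed)
    u = [0.0, 0.0]
    for _ in range(burn_in + length):
        u.append(1.1 * u[-1] - 0.5 * u[-2] + rng.randint(-49, 49))
    return u[-length:]  # discard the start-up values


def systematic_samples(part, k):
    """All k systematic samples of spacing k from one part of the series."""
    return [part[start::k] for start in range(k)]


series = kendall_type_series(480)
part = series[:80]                       # one of the six parts of 80 terms
samples = systematic_samples(part, 16)   # n = 5 terms in each sample
means = [statistics.fmean(s) for s in samples]

# Variance of the systematic-sample mean over the 16 possible starting points,
# taken about the mean of the whole part.
variance_of_mean = statistics.pvariance(means, mu=statistics.fmean(part))
print(round(variance_of_mean, 2))
```

Because every term of the part belongs to exactly one of the 16 samples, the average of the sample means equals the mean of the part, so the quantity printed is the true sampling variance of a single systematic sample mean within that part.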

c. A number of uniformity trials were taken and sampled according to the
systems $st_1st_1$ and $sy_1sy_1$. For sampling according to the system $st_1st_1$ the error
TABLE 5

Comparison of three methods of estimating the sampling error of systematic samples
for an autoregressive scheme

Method    Estimate of sampling variance     E(s²)     True sampling
          per term, s², based on 18                   variance per term
          degrees of freedom

 (1)                3228                                   2170
 (2)                1872                                   2167
 (3)                3709                                    423



TABLE 6

Comparison of efficiencies of different methods of sampling on three uniformity trials

Source                         Kalamkar [8]                Wiebe [9]                   Wynne Sayer and
                                                                                       Krishna Iyer [10]
No. in Cochran's [11]
  catalogue                    72                          132                         log
Crop                           Potatoes                    Wheat                       Sugar cane
No. of plots                   576                         1440                        oto
Mean                           25.262                      5fi7P5                      2?0.g9
Variance per term              15.555                      10,018.0*                   1794

Type of sampling               st1st1  sy1sy1  sy1sy1      st1st1  sy1sy1  sy1sy1      st1st1   sy1sy1  sy1sy1
Proportion sampled             1/6     1/6     1/6         1/9     1/9     1/9         1/8      1/8     1/8
Method of estimating error             (2)     (3)                 (2)     (3)                  (2)     (3)
No. of partitions              1       4       4           1       4       4           1        6       6
n1                             3       3       6           4       2       4           4        2       4
n2                             2       2       1           3       6       3           2        4       2
k1                             16      2       4           20      6       10          16       3       6
k2                             6       12      6           6       6       3           8        8       4
q                              2       4       1           2       4       1           2        4       1
Mean                           23.140  23.435  23.323      686.64  698.66  276.29      276.29   260.72  271.27
Estimated variance per term    9.763   2.689   4.889       6161.6  6772.7  7038.6      1320.16  799.29  1269.64
Degrees of freedom of
  estimated variance           48      12      12          80      12      12          60       16      16

* Based on the original 1500 plots.


was estimated by taking two samples per stratum, while, for sampling according
to the system $sy_1sy_1$, the error was estimated by comparing sets of four samples
in each part of the series by methods (2) and (3). The results of this sampling are
shown in table 6. While the number of trials is small, the trend to be seen in the
results agrees very well with the conclusions reached above.













13. Trend in the population. Frequently in taking samples from a population,
we are faced with the problem of a trend. This will not greatly affect random and
stratified random samples as estimates of the population mean, but the efficiency
of systematic samples will be affected to a large extent. If we consider linear
sampling, and denote by $S_i$ the sample whose first element is $x_i$, then the set of
samples $S_i$ will usually be monotonic with i and the difference between $S_1$ and
$S_k$ will be large (roughly equal to $x_1 - x_k$).

Yates [1] has suggested a method to overcome this difficulty; by letting $S_i$
represent

$$\frac{1}{n-1}\left\{\frac{i}{k}\,x_i + x_{i+k} + \cdots + x_{i+(n-2)k} + \frac{k-i}{k}\,x_{i+(n-1)k}\right\}$$

the difference between systematic samples due to trend is largely removed.
It is easily seen that this necessitates a small loss of information, and in particular,
for a continuous random population the variance is $(n - \tfrac{4}{3})\sigma^2/(n-1)^2$ instead
of $\sigma^2/n$. For plane samples, the corresponding adjusted sample will be


Sii = 


(ft! - l){n2 - 1) 


' V 

_kiki 




+ 


j(ki — i) 

ki kj 






+ 


■iih - j) ^ 


+ 


+ 


(fei - i)(kt - j) 
ki ki 






with a similar loss of information. 

Trend is, however, most likely to be appreciable in large samples, and in this
case, the loss of information due to end adjustments is negligible, so that the
conclusions reached above will remain unaltered.
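
The effect of the end adjustment can be checked numerically. The Python sketch below assumes the end weights i/k and (k − i)/k for the first and last elements of the sample starting at $x_i$ (i = 1, …, k), which is the choice that makes every adjusted sample mean identical when the population is a strictly linear trend; the function names and the trend used are hypothetical.

```python
def plain_mean(x, i, k):
    """Unadjusted mean of the systematic sample starting at x_i (i is 1-based)."""
    sample = x[i - 1::k]
    return sum(sample) / len(sample)


def end_corrected_mean(x, i, k):
    """Systematic sample mean with end adjustment.

    Assumed weights: i/k on the first element, (k - i)/k on the last,
    1 on all others; the divisor is n - 1 because the weights sum to n - 1.
    """
    sample = x[i - 1::k]
    n = len(sample)
    total = (i / k) * sample[0] + sum(sample[1:-1]) + ((k - i) / k) * sample[-1]
    return total / (n - 1)


# A population consisting of a pure linear trend, x_t = 3 + 2t for t = 1..80.
k, n = 16, 5
x = [3 + 2 * t for t in range(1, n * k + 1)]

plain = [plain_mean(x, i, k) for i in range(1, k + 1)]
adjusted = [end_corrected_mean(x, i, k) for i in range(1, k + 1)]
print(min(plain), max(plain))        # unadjusted means drift with the start i
print(min(adjusted), max(adjusted))  # adjusted means do not
```

For a real population the adjusted means will still differ through the random component, and the two end observations contribute less than a full term each, which is the small loss of information referred to above.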

The author wishes to thank Dr. F. Yates and Professor M. S. Bartlett for 
advice in the preparation of this paper. 


REFERENCES

[1] W. G. Cochran, "The relative accuracy of systematic and stratified random samples
    for a certain class of populations," Annals of Math. Stat., Vol. 17 (1946), p. 164.
[2] F. Yates, "A review of recent statistical developments in sampling and sampling
    surveys," Roy. Stat. Soc. Jour., Vol. 109 (1946), p. 12.
[3] G. Dedebant and P. Wehrlé, "Mécanique aléatoire," Portugaliae Physica, Vol. 1
    (1946).
[4] J. G. Osborne, "Sampling errors of systematic and random surveys of cover-type
    areas," Am. Stat. Assn. Jour., Vol. 37 (1942), p. 256.
[5] P. C. Mahalanobis, "On large-scale sample surveys," Roy. Soc. Phil. Trans., B, Vol. 231
    (1944), p. 329.
[6] M. G. Kendall, "On the analysis of oscillatory time-series," Roy. Stat. Soc. Jour.,
    Vol. 108 (1945), p. 93.
[7] M. G. Kendall, Contributions to the Study of Oscillatory Time-Series, Nat. Inst. Econ.
    Soc. Res., 1946.
[8] R. J. Kalamkar, "Experimental errors and the field-plot technique with potatoes,"
    Jour. Agr. Sci., 1932, p. 373.
[9] G. A. Wiebe, "Variation and correlation in grain-yield among 1500 wheat nursery
    plots," Jour. Agr. Res., 1935.
[10] Wynne Sayer and P. V. Krishna Iyer, "On some of the factors that influence the
    error of field experiments with special reference to sugarcane," Ind. Jour. Agr.
    Sci., 1936, p. 917.
[11] W. G. Cochran, "Catalogue of uniformity trial data," Roy. Stat. Soc. Suppl. Jour.,
    Vol. 4 (1937), p. 233.
[12] F. Yates, "Systematic sampling," Roy. Soc. Phil. Trans., Vol. 241 (1948), p. 345.



REPRESENTATION OF PROBABILITY DISTRIBUTIONS BY
CHARLIER SERIES*

By R. P. Boas, Jr.

Brown University

Summary. The paper describes some results concerning the representation 
of a function by linear combinations of the successive differences of the Poisson 
distribution, not necessarily the partial sums of the type B series of Charlier. 

1. Introduction. For various purposes it is often desired to expand a probability
distribution f(x) in a series

(1)    $f(x) \sim \sum_{k=0}^{\infty} c_k\theta_k(x)$,

where the $\theta_k(x)$ are a given set of standard functions. Arguments of a heuristic
nature led Charlier [4, 5, 6] to suggest that it would be useful to take the $\theta_k(x)$
in (1) to be either the successive derivatives or the successive differences of some
fixed function; the two cases are often referred to as type A series and type B
series, respectively. Charlier gave formulas for determining the coefficients in the
two cases, but the question of whether the formal series represents the given
function in any reasonable sense has to be investigated separately for each
particular choice of the function generating the series. Only one special case of
each type has been much used: for the A-series, $\theta_0(x)$ is the normal density
function; for the B-series, $\theta_0(x)$ is the Poisson function $e^{-\lambda}\lambda^x/x!$ (when x
is restricted to take only nonnegative integral values). We shall refer only to
these special cases when we speak of A- and B-series in this paper.

There are two distinct problems (which have, however, often been confused)
connected with the representation of a function f(x) by a series (1); for convenience,
we shall refer to them in this paper as the practical problem and the
theoretical problem. In the practical problem, we have an empirical function f(x),
defined only for a finite number of values of x, which we suspect is representable
by $c_0\theta_0(x)$ together with a small correction, so that we hope that a few (say three
or four) terms of (1) may give a good representation of f(x) in a relatively simple
analytical form with a reasonable amount of computational labor. In some cases,
and certainly with the classical A- and B-series which we are considering, we
could represent, as closely as desired, any f(x) (however irregular) which takes
nonzero values at only a finite number of points; but there is no interest in doing
this if the process involves finding too many terms of the series. (Neglect of this
fact has led to ill-founded statements by mathematicians about the satisfactory
nature of the A- or B-series; but see [27, pp. 38-39].)

Thus it would be of interest to know, if possible, under what circumstances a
given empirical density can be represented fairly well by a few terms of a series
of a given kind. If no simple criterion can be given, it is desirable to have a means
of computing coefficients which will make a few terms of (1) give the best possible
fit, best possible being defined in a way appropriate for the problem at hand.

* Address delivered by invitation at the meeting of the Institute at Boulder, Colorado,
on September 1, 1949.

In the theoretical problem, f(x) is a function defined for all values of x, or at least
for all of an infinite set of equally spaced values of x, arising from theoretical
considerations which suggest $c_0\theta_0(x)$ as a reasonable first approximation to f(x).
For example, the central limit theorem states that under certain conditions the
cumulative distribution function of the sum of a large number of independent
random variables is approximately normal; then we might expect that this
distribution function would be representable by a series (1) with $\theta_0(x)$ the normal
distribution function. For such theoretical purposes we should like to have
criteria for the representability of a sufficiently general f(x) by a series (1),
where representability is of course to be interpreted appropriately, as ordinary
convergence, uniform convergence, convergence in mean square, asymptotic
representation, etc., according to the requirements of the problem at hand. The
larger the class of f(x) for which we can prove a representation theorem, the
larger is the possible domain of applicability of the series to theoretical problems.

2. The A-series. This paper is concerned with the B-series, but for comparison
we first mention some properties of the A-series. In the case of the classical
A-series, we have the attractive fact that the functions $\theta_n(x)$ are orthogonal
with weight function $e^{x^2/2}$; that is,

$$\int_{-\infty}^{\infty} \theta_m(x)\theta_n(x)e^{x^2/2}\,dx = 0, \qquad m \neq n.$$

In fact, $e^{x^2/2}\theta_n(x)$ is, except for a numerical factor, the nth Hermite polynomial.
This orthogonality property enables one to compute the coefficients in a series (1)
with great ease from

(2)    $n!\,c_n = \int_{-\infty}^{\infty} f(x)\theta_n(x)e^{x^2/2}\,dx,$

or, since $\theta_n(x)e^{x^2/2}$ is a polynomial, from the moments of f(x). By the classical
theory of orthogonal functions, this means that if the $c_n$ are so computed, and we
take N + 1 terms of the series, we minimize

(3)    $\int_{-\infty}^{\infty} e^{x^2/2}[f(x) - F_N(x)]^2\,dx$

for all possible sums

(4)    $F_N(x) = \sum_{n=0}^{N} c_n\theta_n(x).$

The convergence theory of Hermite series has been thoroughly investigated by
mathematicians, so that it would appear that in theoretical problems, in which
f(x) is given for all values of x, we are in a position to find out everything about
the representation of f(x) by an A-series. Also in problems of practical curve-fitting,
the fact that the closest approximation to f(x) (in the sense (3)) by sums
of the form (4) is given by choosing the coefficients according to (2) seems to
leave no more to be said.

However, the formal elegance of the A-series seems to be somewhat misleading.
Even when a series converges it by no means follows that its Nth partial sum is
the best selection of N terms for representing a given function. Even though the
partial sums do give the best fit in the sense of (3), it may not be desirable to
measure the closeness of approximation by (3); some other measure of approximation
may be better suited to the end in view. For example, it is known that
the partial sums of Edgeworth’s series (see [8]), which is a rearrangement of the
A-series, are more satisfactory for some purposes than the partial sums of the
A-series with the coefficients determined by (2). More precisely, Edgeworth’s
series furnishes an asymptotic expansion, with a remainder term whose order of
magnitude can be estimated quite precisely, in circumstances where the series of
orthogonal functions does not do this. Again, for practical purposes a few terms
of the A-series sometimes exhibit undesirable properties (such as negative
frequencies). If f(x) is a function defined only for integral values of x, A. Fisher
[10] has suggested and applied the idea of minimizing, not (3), but the sum
$\sum_{x=-\infty}^{\infty} |f(x) - F_N(x)|^2$ in order to determine the coefficients of the approximating
sums.

3. The B-series. We can now see how the status of the B-series resembles or
differs from that of the A-series. Here we deal principally with a function defined
for integral values of x; $\theta_0(x) = \theta(x) = e^{-\lambda}\lambda^x/x!$, $\Delta\theta(x) = \theta(x) - \theta(x-1)$,
$\Delta^k\theta(x) = \Delta(\Delta^{k-1}\theta(x))$ and $\theta_k(x) = \Delta^k\theta(x)$; $\theta(x)$ is taken to be 0 for negative
integral x. We shall refer to this as the discrete case of the B-series. The literature
of the subject contains a number of rather painful attempts to put the coefficients
into usable form, persisting even after the simple formula

(5)    $c_n = (1/n!)\sum_{j=0}^{n}\binom{n}{j}(-1)^j\lambda^{n-j}\mu_j$

had been obtained, where $\mu_n$ is the nth factorial moment,

$$\mu_n = \sum_{k=n}^{\infty} f(k)\,k!/(k-n)!.$$

Formula (5) can be derived, for example, by using orthogonality properties of
the $\theta_n(x)$. We have, in fact, that $\sum_{x=0}^{\infty}\theta_n(x)\theta_m(x)/\theta_0(x)$ is 0 or $n!\,\lambda^{-n}$ according as
n ≠ m or n = m.
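
As an illustration, the coefficients of the B-series can be computed directly from factorial moments. The Python sketch below assumes the form $c_n = (1/n!)\sum_{j=0}^{n}\binom{n}{j}(-1)^j\lambda^{n-j}\mu_j$ for formula (5), takes f as a list of values f(0), f(1), … on a finite range, and checks the orthogonality relation numerically on a truncated range; the function names and the sample frequencies are hypothetical.

```python
import math


def poisson(x, lam):
    """theta(x) = exp(-lam) * lam**x / x!, taken to be 0 for negative x."""
    return math.exp(-lam) * lam**x / math.factorial(x) if x >= 0 else 0.0


def difference(n, x, lam):
    """n-th difference of theta, built from theta(x) - theta(x - 1)."""
    return sum((-1)**j * math.comb(n, j) * poisson(x - j, lam) for j in range(n + 1))


def factorial_moment(f, n):
    """mu_n = sum over k of f(k) * k! / (k - n)!  (terms with k < n vanish)."""
    return sum(fk * math.perm(k, n) for k, fk in enumerate(f))


def b_series_coefficient(f, n, lam):
    """Coefficient c_n of the n-th difference, in the assumed form of formula (5)."""
    total = sum(math.comb(n, j) * (-1)**j * lam**(n - j) * factorial_moment(f, j)
                for j in range(n + 1))
    return total / math.factorial(n)


def orthogonality_sum(n, m, lam, x_max=60):
    """Truncated sum of theta_n(x) * theta_m(x) / theta_0(x) over x >= 0."""
    return sum(difference(n, x, lam) * difference(m, x, lam) / poisson(x, lam)
               for x in range(x_max))


f = [0.2, 0.5, 0.3]                          # hypothetical distribution on x = 0, 1, 2
lam = sum(x * fx for x, fx in enumerate(f))  # lambda chosen equal to the mean
print([b_series_coefficient(f, n, lam) for n in range(4)])
print(orthogonality_sum(2, 2, 1.5), orthogonality_sum(1, 2, 1.5))
```

With $\lambda$ equal to the mean, the coefficient of the first difference vanishes, in agreement with the customary choice of the parameter.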

The parameter $\lambda$ in the B-series is at our disposal, and can for example be
chosen in such a way as to improve the convergence of the series. For purposes of
practical curve-fitting, it has been customary to choose $\lambda$ equal to the mean of
the distribution f(x), a choice which makes the coefficient $c_1$ of $\Delta\theta$ equal to zero.
Charlier also suggested other methods in which $c_1$ and $c_2$, or $c_1$, $c_2$ and $c_3$ are
zero [7]. Such choices, of course, may reduce the amount of computation needed
to make use of a given number of differences in fitting a curve; aside from this
consideration their use seems to depend on the belief that one improves the
convergence of a series by adjusting any available parameters so that as many as
possible of the initial terms of the series are zero. This belief does not always
seem to be confirmed by the facts. (In particular, compare columns 2 and 5 of
Table 1, columns 2 and 4 of Table 2, or columns 2 and 4 of Table 3.)

The theoretical problem of what f(x) can be represented by convergent
B-series has been studied by several authors [12, 13, 17, 19, 20, 21, 23, 24, 26, 28];
the study by Schmidt [24; see also 25 and 17] gives necessary and sufficient
conditions for the representation in the case of a nonnegative f(x), so that, at
least in all cases of interest in statistics, the theoretical problem seems to be
completely solved. However, one of the purposes of the present paper is to
reopen this apparently closed problem.

There is also a continuous version of the B-series, which is suggested by the
fact that

(6)    $\theta(x) = (2\pi)^{-1}e^{-\lambda}\int_{-\pi}^{\pi} e^{-ixu}\exp(\lambda e^{iu})\,du$

reduces to the Poisson function $e^{-\lambda}\lambda^x/x!$ for positive integral x (and to 0 for
negative integral x). This form of the B-series has not been much used, and its
use is subject to suspicion since it has rather peculiar properties. In particular,
it cannot represent, in any reasonable sense, a positive function f(x) or one which
is too small as $x \to \infty$ [26, 3]; since the functions which present themselves for
representation in practice are both positive and small at infinity, the continuous
case of the B-series looks unpromising for applications. (See also [27a], [1a].) However,
it has been applied [16].
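
Formula (6) is easy to check numerically. The Python sketch below assumes that the integral is taken over $(-\pi, \pi)$, as the reduction to the Poisson function for integral x requires, and evaluates it with the midpoint rule; the function name and the number of steps are hypothetical choices.

```python
import cmath
import math


def theta_continuous(x, lam, steps=2000):
    """Midpoint-rule value of (1/2pi) e^{-lam} * integral over (-pi, pi)
    of exp(-i x u) * exp(lam * e^{i u}) du."""
    h = 2 * math.pi / steps
    total = 0j
    for j in range(steps):
        u = -math.pi + (j + 0.5) * h
        total += cmath.exp(-1j * x * u) * cmath.exp(lam * cmath.exp(1j * u))
    # The integrand at -u is the complex conjugate of the integrand at u,
    # so the integral is real and only the real part is kept.
    return (math.exp(-lam) * total * h / (2 * math.pi)).real


lam = 2.0
for x in (-2, 0, 3):
    exact = math.exp(-lam) * lam**x / math.factorial(x) if x >= 0 else 0.0
    print(x, theta_continuous(x, lam), exact)

# Between the integers the function is defined but need not behave like a frequency.
print(theta_continuous(-0.5, lam))
```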

The purpose of this paper is to describe some results on the B-series which
have been obtained in a mathematical paper [3], devoted to what we have
called the theoretical problem; some contributions to the practical problem
will also be given in the present paper. The starting point of this investigation
was the question of what happens if one tries to approximate a function, not
by the partial sums of the series (1), but by some other combination of the
first N functions $\theta_n(x)$, when approximation is taken in the sense of (unweighted)
least-squares. This method of approximation seems well adapted to statistical
problems, and leads to simpler mathematical work than ordinary point-by-point
convergence of the partial sums. The B-series itself gives a least-squares approximation
with a weight function $1/\theta_0(x)$. We consider here only the classical B-series,
when $\theta_0(x) = \theta(x) = e^{-\lambda}\lambda^x/x!$, $\theta_n(x) = \Delta^n\theta_0(x)$; the main results are substantially
the same for rather more general cases [3; see also 14, 25]. In addition, here we
consider only nonnegative f(x), assumed zero for negative x. Functions which
need not be zero for negative x are handled easily by generalizing the B-series
to the form [3]

(7)    $f(x) \sim \sum_{n=1}^{\infty} b_n\nabla^n\theta(x) + \sum_{n=0}^{\infty} a_n\Delta^n\theta(x),$

where $\nabla$ denotes the advancing difference: $\nabla\theta(x) = \theta(x) - \theta(x+1)$; there
seems to be no particular reason (other than a historical one) for preferring one
kind of difference to the other. The generalized series (7) might be useful for
graduating symmetrical probability distributions, although it does not seem to
have been considered in the literature (cf. [1a]).

4. Results: practical problem. Our question takes somewhat different forms
in the two cases which we have described as the practical and the theoretical.
In the former, we ask what the coefficients $a_k^{(N)}$ should be so that

(8)    $\sum_{x=0}^{\infty}\left[f(x) - \sum_{k=0}^{N} a_k^{(N)}\Delta^k\theta(x)\right]^2$

shall be a minimum, where f(x) is an empirically given function and N is a given
integer, in general not very large. If N is 0, 1 or 2, that is, if we use 1, 2 or 3
terms, the best choice of the $a_k^{(N)}$ in (8) can be calculated without difficulty.

For N = 0, our question is that of finding the best least-squares fit to f(x)
by a Poisson distribution $a_0^{(0)}\theta(x)$; the best choice of $a_0^{(0)}$ is then

(9)    $a_0^{(0)} = e^{2\lambda}\sum_{x=0}^{\infty} f(x)\theta(x)\,/\,J_0(2i\lambda),$

where

$$J_0(iy) = 1 + (y/2)^2/(1!)^2 + (y/2)^4/(2!)^2 + \cdots$$

($J_0$ denotes the Bessel function of order 0); on the other hand, the usual formula
(5) gives the different coefficient

$$c_0 = \mu_0 = \sum_{x=0}^{\infty} f(x).$$

This, of course, is simpler than (9) to compute, although its use is based on the
uncritical assumption that the first term of the series (1) is the best one to take
if only one term is to be used. Charlier [7; see also 10, pp. 101-103] suggested a
different formula in which one uses, not $\Delta^k\theta(x)$, but $\Delta^k\theta(px + q)$, the parameters
p, q, $\lambda$ being adjusted to make the terms of (1) in $\Delta\theta$, $\Delta^2\theta$, $\Delta^3\theta$ all zero; here $\theta(x)$
is defined when x is not an integer by interpreting $e^{-\lambda}\lambda^x/x!$ as $e^{-\lambda}\lambda^x/\Gamma(x+1)$,
and not by using formula (6). Table 2 shows that in at least one numerical case
(9) gives a better least-squares fit than Charlier’s method (and without introducing
gamma functions to take care of $\theta(x)$ for fractional x). However, it is
not excluded that Charlier’s method will give better results in other cases,
since with the change of the functions $\theta_n(x)$ the results of this paper cease to
apply.

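For N = 1, we get the best least-squares approximation to f(x) by
$a_0^{(1)}\theta(x) + a_1^{(1)}\Delta\theta(x)$ with

(10)
$$e^{-2\lambda}a_0^{(1)} = \frac{S_0 + S_1}{\alpha + \beta}, \qquad
e^{-2\lambda}a_1^{(1)} = \frac{\beta S_0 - \alpha S_1}{\alpha^2 - \beta^2},$$

where $S_0 = \sum_{x=0}^{\infty} f(x)\theta(x)$, $S_1 = \sum_{x=0}^{\infty} f(x)\theta(x-1)$, $\alpha = J_0(2i\lambda)$,
$\beta = -iJ_1(2i\lambda)$, the J's again denoting Bessel functions. For N = 2, the corresponding
formulas involve also $\gamma = -J_2(2i\lambda)$ and $S_2 = \sum_{x=0}^{\infty} f(x)\theta(x-2)$. They are:

(11)
$$e^{-2\lambda}a_0^{(2)} = \frac{\beta-\alpha}{2\beta^2-\alpha^2-\alpha\gamma}\,S_0
 + \frac{2\beta-\alpha-\gamma}{2\beta^2-\alpha^2-\alpha\gamma}\,S_1
 + \frac{\beta-\alpha}{2\beta^2-\alpha^2-\alpha\gamma}\,S_2,$$

$$e^{-2\lambda}a_1^{(2)} = \frac{\beta\gamma-\alpha\beta+2\beta^2-2\alpha\gamma}{(\alpha-\gamma)(2\beta^2-\alpha^2-\alpha\gamma)}\,S_0
 + \frac{\alpha+\gamma-2\beta}{2\beta^2-\alpha^2-\alpha\gamma}\,S_1
 + \frac{2\alpha^2-2\beta^2+\beta\gamma-\alpha\beta}{(\alpha-\gamma)(2\beta^2-\alpha^2-\alpha\gamma)}\,S_2,$$

$$e^{-2\lambda}a_2^{(2)} = \frac{\alpha\gamma-\beta^2}{(\alpha-\gamma)(2\beta^2-\alpha^2-\alpha\gamma)}\,S_0
 + \frac{\beta}{2\beta^2-\alpha^2-\alpha\gamma}\,S_1
 + \frac{\beta^2-\alpha^2}{(\alpha-\gamma)(2\beta^2-\alpha^2-\alpha\gamma)}\,S_2.$$

The functions $i^{-n}J_n(iy)$ are real for real y, and extensive tables are available [32].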
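
The least-squares coefficients for N = 0 and N = 1 can be checked numerically. The Python sketch below is an illustration under stated assumptions: it takes $a_0^{(0)} = e^{2\lambda}S_0/\alpha$ for N = 0, and $a_0^{(1)} = e^{2\lambda}(S_0+S_1)/(\alpha+\beta)$, $a_1^{(1)} = e^{2\lambda}(\beta S_0 - \alpha S_1)/(\alpha^2-\beta^2)$ for N = 1, these being the solutions of the normal equations for the sum of squares (8), with $\alpha = J_0(2i\lambda)$ and $\beta = -iJ_1(2i\lambda)$ computed from the power series of the modified Bessel functions $I_0(2\lambda)$ and $I_1(2\lambda)$. The function names and the sample frequencies are hypothetical.

```python
import math


def poisson(x, lam):
    """theta(x) = exp(-lam) * lam**x / x!, taken to be 0 for negative x."""
    return math.exp(-lam) * lam**x / math.factorial(x) if x >= 0 else 0.0


def difference(n, x, lam):
    """n-th difference of theta, built from theta(x) - theta(x - 1)."""
    return sum((-1)**j * math.comb(n, j) * poisson(x - j, lam) for j in range(n + 1))


def bessel_i(n, y, terms=40):
    """I_n(y) = i**(-n) * J_n(i*y), summed from its power series."""
    return sum((y / 2)**(2 * m + n) / (math.factorial(m) * math.factorial(m + n))
               for m in range(terms))


def shifted_sum(f, shift, lam):
    """S_shift = sum over x of f(x) * theta(x - shift)."""
    return sum(fx * poisson(x - shift, lam) for x, fx in enumerate(f))


def best_one_term(f, lam):
    """Least-squares multiple of theta(x) (case N = 0)."""
    return math.exp(2 * lam) * shifted_sum(f, 0, lam) / bessel_i(0, 2 * lam)


def best_two_terms(f, lam):
    """Least-squares coefficients of theta and its first difference (case N = 1)."""
    alpha, beta = bessel_i(0, 2 * lam), bessel_i(1, 2 * lam)
    s0, s1 = shifted_sum(f, 0, lam), shifted_sum(f, 1, lam)
    scale = math.exp(2 * lam)
    return (scale * (s0 + s1) / (alpha + beta),
            scale * (beta * s0 - alpha * s1) / (alpha**2 - beta**2))


def squared_error(f, coefficients, lam, x_max=60):
    """Truncated value of the sum of squares (8) for given coefficients."""
    total = 0.0
    for x in range(x_max):
        fx = f[x] if x < len(f) else 0.0
        fitted = sum(a * difference(k, x, lam) for k, a in enumerate(coefficients))
        total += (fx - fitted)**2
    return total


f = [0.60, 0.25, 0.10, 0.03, 0.02]            # hypothetical relative frequencies
lam = sum(x * fx for x, fx in enumerate(f))   # lambda chosen equal to the mean
a0, a1 = best_two_terms(f, lam)
# With lambda equal to the mean, the first two B-series coefficients are 1 and 0.
print(squared_error(f, [a0, a1], lam), squared_error(f, [1.0, 0.0], lam))
```

The first printed value cannot exceed the second, since the B-series coefficients are one particular choice among those over which the least-squares coefficients minimize (8).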

Some numerical examples showing the comparison between graduation by
these formulas and by the corresponding number of terms of the B-series are
given in Tables 1-3. It will be noticed that (as the theory indicates) one gets a
better least-squares fit by formulas (9), (10) or (11) than by a corresponding
number of terms of the B-series using the coefficients (5). However, one may not
get a better fit if goodness of fit is measured in some other way, e.g. by $\chi^2$.
Unfortunately the coefficients calculated by this method increase rapidly in
complexity as the number of terms increases, and even the coefficients for N = 3
would involve very heavy algebra. Since numerical examples [2] indicate that it
is often necessary to go to terms in for a satisfactory fit, it might be worth 
while to calculate the next few coefficients. 


5. Results: theoretical problem. In the case of a theoretical distribution we
ask how coefficients $a_k^{(N)}$ should be determined so that

(12)    $\sum_{x=0}^{\infty}\left|f(x) - \sum_{k=0}^{N} a_k^{(N)}\Delta^k\theta(x)\right|^2$

will tend to 0 as $N \to \infty$. The convergence to 0 of (12) is a rather strong kind of
convergence, since it implies convergence of the approximating sums to f(x),
not only for each x, but even uniformly for all x. Of course, the "best" choice of
$a_k^{(N)}$ as above would be expected to give convergence under the weakest hypotheses,
but because of the complexity of these coefficients it seems desirable to
make (12) only approximately a minimum; this actually makes no difference
in the limit, although the approximation is not usually satisfactory for small
values of N. To see the connection between the formulas used here and the
"classical" formula (5) for the coefficients in (1), we note that (5) can be written

(13) ^ ±m £ [/ ; 

n\ jb-o 02" 


(5) results if we expand the derivative by Leibniz’s rule and rearrange the sum.
If we expand in a power series before differentiating in (13), we obtain


Jfc—0 niax(Jfc,n) \^/ I—o(t rC) 1 


If now we break this series off at n = ^ to obtain 


(14) = 

we obtain a sequence of approximations to f(x) by sums $\sum_{n=0}^{N} a_n^{(N)}\Delta^n\theta(x)$ which
has, in general, much better convergence properties than the partial sums of the
B-series with coefficients $c_n$ given by (5). In particular, if f(x) = 0 for x = −1,
−2, …, this sequence of approximations converges to f(x) whenever $\sum_{x=0}^{\infty} |f(x)|^2$
converges; on the other hand, for nonnegative f(x) it is known [24] that the
B-series converges if and only if $\lim_{x\to\infty} f(x)2^x x^k = 0$ for k = 0, 1, 2, …, a much
more restrictive condition. If we demand that the partial sums of the B-series
converge in mean square, that is, that (12) tends to zero with $a_k^{(N)}$ independent of
N, we have the even more restrictive condition [3] that lim supx-.» {/(x) g 
The approximating sums with coefficients (14) have the additional property
that they reproduce f(x) exactly for x = 0, 1, 2, …, N. One would expect
that in general they would then tend to deviate rather widely from f(x) for
larger x, and so would not be satisfactory for practical curve-fitting. However,
it seems possible that if we fit such a sum not to f(x), but to f(px + q), with
suitable integers p and q, thus making the approximation agree with f(x) at a
set of values covering the whole range of definition of f(x), it might give a satisfactory
fit elsewhere. This possibility has not been investigated; a similar
approach using the partial sums of the B-series was suggested by Charlier [7]
and Fisher [10].


6. The continuous case of the B-series. In the continuous case we again ask,
not when

(15)    $f(x) = \sum_{n=0}^{\infty} a_n\Delta^n\theta(x)$

with uniform convergence in every finite interval, but when

(16)    $f(x) = \operatorname*{l.i.m.}_{N\to\infty}\sum_{n=0}^{N} a_n^{(N)}\Delta^n\theta(x),$

which means that

(17)    $\lim_{N\to\infty}\int_{-\infty}^{\infty}\left|f(x) - \sum_{n=0}^{N} a_n^{(N)}\Delta^n\theta(x)\right|^2 dx = 0.$

For (15) the following negative results are known [26]: if $f(x) \ge 0$, (15) cannot
converge uniformly on every finite interval (unless $f(x) \equiv 0$); the series, if
convergent uniformly on every finite interval, cannot converge to f(x) unless
the Fourier transform of f(x) vanishes outside $(-\pi, \pi)$, a condition which


TABLE 1

Number of petals on buttercups. λ = .631

            1            2             3             4             5             6
                     Calculated    Calculated    Calculated    Calculated    Calculated
  x      Observed     3 terms       1 term        2 terms       3 terms       3 terms
         frequency   (formula 5)   (formula 9)   (formula 10)  (formula 11)  (formula 14)

  5        133         134.9         119.9                       132.9         133.0
  6         55          51.6          75.6                        55.3          55.0
  7         23          22.5          22.5                        22.1          23.0
  8          7           9.6           6.0                         8.5           9.1
  9          2           2.9           0.8           0.0           2.4           2.6
 10          2           0.6           0.1           0.0           0.6           0.5

Total      222         222.0                       207.7         221.7         223.2



automatically excludes any f(x) which vanishes for all large |x| or even is too
small as $x \to \infty$. Nevertheless, Jørgensen [15] applies the continuous case successfully
to practical problems. A possible explanation of this apparent discrepancy
is that if the $a_n^{(N)}$ in (16) are properly determined, (16) will be true under fairly
general conditions. To be sure, the mean square difference in (17) cannot be
made arbitrarily small unless the Fourier transform g(x) of f(x) vanishes outside
$(-\pi, \pi)$, but if $|f(x)|^2$ is integrable the difference can be made small if g(x) is
itself small. If g(x) does vanish outside $(-\pi, \pi)$, then (16) is true; and in fact
the coefficients can be taken the same as in (14), so that the approximating
sums depend only on the values of f(x) for integral values of x; these values are
known to determine f(x) under our hypotheses on g(x).

7. Discussion of some numerical results. Table 1. Column 2 gives the fit by
two terms of the B-series (really three, since the coefficient of $\Delta\theta$ is zero when


















formula (6) is used), as calculated by Charlier [7] (that is, using terms through
$\Delta^2\theta$). Column 3 gives the best least-squares fit by a single term, i.e., a Poisson
distribution, calculated by formula (9); it is clear that this term alone does not
represent the observations very well. Column 4 gives the best least-squares
fit by terms through $\Delta\theta$. Column 5 gives the best least-squares fit by terms
through $\Delta^2\theta$; the improvement over Charlier's fit by the same number of terms
is evident by inspection. Column 6 gives, for comparison, the same number
of terms calculated by formula (14), which gives an approximation to the best
least-squares fit and necessarily reproduces the data exactly for the first three

TABLE 2

Failure of grains of barley. $\lambda = 2.757$

           1            2             3             4             5
        Observed    Calculated    Calculated    Calculated    Calculated
  x     frequency    4 terms       1 term        2 terms       3 terms
                    (Charlier)    (formula 9)   (formula 10)  (formula 11)

  0        —                         47.3          49.9          48.4
  1                                 130.4         134.7         133.4
  2       180          174          179.8         181.6         182.3
  3       170          161          165.3         163.2         164.3
  4       111          111          113.9                       109.8
  5        60           60           62.7          69.3          58.1
  6        22           32           28.8          26.6          26.2
  7        22           14           11.4          10.2           9.3
  8         7            6            3.9           3.4           2.9
  9         2            2            1.1           1.0           0.8
 10         1            0            0.3           0.2           0.2

Total     749          752          744.9         740.0         734.7


values of $x$. The fact that (14) gives good results here is presumably connected
with the small size of $\lambda$.
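The least-squares fits discussed for Table 1 can be illustrated numerically. The following is only a sketch under stated assumptions: formulas (9) to (11) are not reproduced here, so the fit is obtained by solving the normal equations directly; the helper names (`theta`, `delta_k_theta`, `least_squares_fit`, `fitted`) are hypothetical; $\Delta$ is taken to be the backward difference; and the petal counts 5, ..., 10 are assumed to be re-indexed to $x = 0, \ldots, 5$.

```python
import math

def theta(x, lam):
    """Poisson term e^{-lam} lam^x / x!, taken as 0 for negative x."""
    return math.exp(-lam) * lam ** x / math.factorial(x) if x >= 0 else 0.0

def delta_k_theta(k, x, lam):
    """k-th backward difference of theta at x: sum_j (-1)^j C(k, j) theta(x - j)."""
    return sum((-1) ** j * math.comb(k, j) * theta(x - j, lam) for j in range(k + 1))

def solve(A, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def least_squares_fit(freq, lam, terms, pad=40):
    """Coefficients a_0..a_{terms-1} minimizing sum_x (f(x) - sum_k a_k Delta^k theta(x))^2."""
    xs = range(len(freq) + pad)  # f(x) = 0 beyond the data; the tail of theta is negligible
    f = [freq[x] if x < len(freq) else 0.0 for x in xs]
    basis = [[delta_k_theta(k, x, lam) for x in xs] for k in range(terms)]
    gram = [[sum(p * q for p, q in zip(bi, bj)) for bj in basis] for bi in basis]
    rhs = [sum(p * q for p, q in zip(bi, f)) for bi in basis]
    return solve(gram, rhs)

def fitted(coeffs, lam, x):
    """Value of the fitted sum at x."""
    return sum(a * delta_k_theta(k, x, lam) for k, a in enumerate(coeffs))

observed = [133, 55, 23, 7, 2, 2]  # Table 1, column 1, re-indexed (assumption)
lam = 0.631
one_term = least_squares_fit(observed, lam, terms=1)
three_terms = least_squares_fit(observed, lam, terms=3)
print([round(fitted(one_term, lam, x), 1) for x in range(6)])
print([round(fitted(three_terms, lam, x), 1) for x in range(6)])
```

Under these assumptions the single-term fit gives about 119.9 at the first class, in agreement with the first entry of column 3 of Table 1.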

Table 2. Column 2 gives the values calculated by Charlier [7] for a fit after
the linear transformation $x \to px + q$, with $\lambda$, $p$ and $q$ chosen to make the terms
in $\Delta\theta$, $\Delta^2\theta$, $\Delta^3\theta$ all zero (the values were read to the nearest integer from Charlier's
graph). Column 3 gives the best least-squares single-term fit calculated by
formula (9); this is a considerable improvement for $x \le 6$, but for the remainder
of the table it is rather poor. Column 4 gives the best least-squares fit by two
terms; column 5, that by three. The $\chi^2$-test indicates that the graduation is
rather poor in all cases.

Table 3. Column 2 gives the classical calculation with terms through $\Delta^2\theta$;
this was given by A. Fisher [10] and (more accurately) by Aroian [2]. Columns 3














and 4 give the best least-squares approximations by two and three terms;
column 4 is better than column 2, in this sense, as expected. However, column 4
is a poorer fit when tested by $\chi^2$, chiefly because of the poor fit at $x = 0$. It should
be noted that two more terms of the B-series give a more satisfactory fit [2].

TABLE 3

$\alpha$-particles from a bar of polonium. $\lambda = 3.87155$

           1            2             3             4
        Observed    Calculated    Calculated    Calculated
  x     frequency    3 terms       2 terms       3 terms
                    (formula 6)   (formula 10)  (formula 11)

  0        57          49.6          51.3          45.2
  1       203         201.3         213.3         190.9
  2       383         403.4         399.0         393.5
  3       525         532.3         524.8         529.8
  4       532         520.6         517.2         525.4
  5       408         402.6         407.7         409.7
  6       273         254.8         267.7         261.9
  7       139         137.1         150.6         141.1
  8        45          64.0          74.1          65.3
  9        27          26.1          32.4          26.3
 10        10           9.4          12.8           9.3
 11         4           3.0           4.6           2.9
 12         0           0.9           1.5           0.8
 13         1           0.2           0.6           0.2
 14         1           0.0           0.1           0.0

Total    2608        2606.2        2657.6        2602.3

                  $\chi^2 = 10.2$   $\chi^2 = 16.2$   $\chi^2 = 11.4$
                    n = 7         n = 8         n = 7


8. Proofs: theoretical problem. We now outline the proofs of the results which
we have stated. They depend on the fact that the numbers $\theta(x)$ ($x = 0, \pm 1,
\pm 2, \cdots$) (where $\theta(x) = 0$ when $x$ is a negative integer) are the Fourier coefficients
of the function $\varphi(u) = e^{-\lambda} \exp(\lambda e^{iu})$, i.e.

    $\theta(x) = (2\pi)^{-1} \int_{-\pi}^{\pi} \varphi(u) e^{-ixu} \, du, \qquad x = 0, \pm 1, \pm 2, \cdots.$

Furthermore,

    $\Delta^k \theta(x) = (2\pi)^{-1} \int_{-\pi}^{\pi} \varphi(u) (1 - e^{iu})^k e^{-ixu} \, du.$
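The two Fourier-coefficient identities on which these proofs rest can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, $\Delta$ is taken to be the backward difference, and the integrals are approximated by an equally spaced rectangle rule, which is very accurate for smooth periodic integrands.

```python
import cmath
import math

def theta(x, lam):
    """Poisson term e^{-lam} lam^x / x!, taken as 0 for negative x."""
    return math.exp(-lam) * lam ** x / math.factorial(x) if x >= 0 else 0.0

def phi(u, lam):
    """phi(u) = e^{-lam} exp(lam e^{iu})."""
    return math.exp(-lam) * cmath.exp(lam * cmath.exp(1j * u))

def fourier_coefficient(h, x, points=256):
    """(2 pi)^{-1} * integral over (-pi, pi) of h(u) e^{-ixu} du, by the rectangle rule."""
    step = 2 * math.pi / points
    total = 0j
    for j in range(points):
        u = -math.pi + j * step
        total += h(u) * cmath.exp(-1j * x * u)
    return total / points

def delta_k_theta(k, x, lam):
    """k-th backward difference of theta at x."""
    return sum((-1) ** j * math.comb(k, j) * theta(x - j, lam) for j in range(k + 1))

lam = 3.87155
for x in (-2, 0, 1, 5):
    error = abs(fourier_coefficient(lambda u: phi(u, lam), x) - theta(x, lam))
    print(x, theta(x, lam), error)

k = 2
lhs = fourier_coefficient(lambda u: phi(u, lam) * (1 - cmath.exp(1j * u)) ** k, 3)
print(abs(lhs - delta_k_theta(k, 3, lam)))
```

The printed errors should be at the level of rounding error, including the case of a negative index, where the coefficient vanishes.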














If we then assume the condition $\sum |f(x)|^2 < \infty$, with $f(x) = 0$ for $x =
-1, -2, \cdots$, the numbers $f(x)$ are the Fourier coefficients of a function $g(u)$
of integrable square, by the Riesz-Fischer theorem from the theory of Fourier
series [31, p. 74]:

    $f(x) = (2\pi)^{-1} \int_{-\pi}^{\pi} g(u) e^{-ixu} \, du, \qquad x = 0, \pm 1, \pm 2, \cdots.$

Thus

(18)    $f(x) - \sum_{k=0}^{N} a_k^{(N)} \Delta^k \theta(x) = (2\pi)^{-1} \int_{-\pi}^{\pi} e^{-ixu} \Bigl[ g(u) - \varphi(u) \sum_{k=0}^{N} a_k^{(N)} (1 - e^{iu})^k \Bigr] du,$

and so the expressions on the left appear as the Fourier coefficients of the expressions
in square brackets on the right. By Parseval's theorem for Fourier series
[31, p. 76], then, we have

(19)    $\sum_{x=-\infty}^{\infty} \Bigl| f(x) - \sum_{k=0}^{N} a_k^{(N)} \Delta^k \theta(x) \Bigr|^2 = (2\pi)^{-1} \int_{-\pi}^{\pi} \Bigl| g(u) - \varphi(u) \sum_{k=0}^{N} a_k^{(N)} (1 - e^{iu})^k \Bigr|^2 du.$


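Thus we have reduced the problem of minimizing the mean-square difference
on the left of (19) to that of minimizing the integral on the right of (19). By
rearranging the sum in the integrand, we see that an equivalent problem is to
minimize

(20)    $D = (2\pi)^{-1} \int_{-\pi}^{\pi} \Bigl| g(u) - \varphi(u) \sum_{k=0}^{N} c_k^{(N)} e^{iku} \Bigr|^2 du,$

where the $a_k^{(N)}$ and $c_k^{(N)}$ are readily expressed in terms of each other; in fact,

(21)    $c_j^{(N)} = (-1)^j \sum_{k=j}^{N} \binom{k}{j} a_k^{(N)}, \qquad a_k^{(N)} = (-1)^k \sum_{j=k}^{N} \binom{j}{k} c_j^{(N)}.$

Since $|\varphi(u)| = e^{-\lambda + \lambda \cos u} \ge e^{-2\lambda} > 0$, we can write $D$ in the form

    $D = (2\pi)^{-1} \int_{-\pi}^{\pi} \Bigl| g(u)/\varphi(u) - \sum_{k=0}^{N} c_k^{(N)} e^{iku} \Bigr|^2 |\varphi(u)|^2 \, du,$

so that

(22)    $e^{-4\lambda} \int_{-\pi}^{\pi} \Bigl| g(u)/\varphi(u) - \sum_{k=0}^{N} c_k^{(N)} e^{iku} \Bigr|^2 du \le 2\pi D \le \int_{-\pi}^{\pi} \Bigl| g(u)/\varphi(u) - \sum_{k=0}^{N} c_k^{(N)} e^{iku} \Bigr|^2 du,$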





since $e^{-2\lambda} \le |\varphi(u)| \le 1$. Thus we can make $D$ arbitrarily small if and only if
we can make

(23)    $D^* = (2\pi)^{-1} \int_{-\pi}^{\pi} \Bigl| g(u)/\varphi(u) - \sum_{k=0}^{N} c_k^{(N)} e^{iku} \Bigr|^2 du$

arbitrarily small. Now the Fourier coefficients of $g(u)$ are $f(x)$; those of $1/\varphi(u)$
are $e^{\lambda}(-\lambda)^x/x!$ for $x \ge 0$, 0 for $x < 0$; by the convolution theorem for Fourier
coefficients [31, p. 90] the $n$th Fourier coefficient of $g(u)/\varphi(u)$ is

(24)    $\sum_{k=0}^{n} f(n - k) e^{\lambda} (-\lambda)^k / k!, \qquad n = 0, 1, 2, \cdots,$

and zero for $n < 0$. Furthermore, it is well known from the theory of Fourier
series that $D^*$ is a minimum if the $c_k^{(N)}$ are chosen as the first $N + 1$ Fourier coefficients
of $g(u)/\varphi(u)$, and that this minimum is arbitrarily small for large enough $N$
if and only if the Fourier coefficients of $g(u)/\varphi(u)$ are zero for negative indices,
which is in fact the case. If we then take the values (24) for $c_k^{(N)}$, $k = 0, 1, \cdots,
N$, and express $a_k^{(N)}$ in terms of $c_k^{(N)}$ by (21), we arrive at the formula (14).
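The construction just described can be sketched in a few lines. Everything here is illustrative rather than the paper's own computation: formula (14) itself is not reproduced in this section, so the coefficients are rebuilt from (24) followed by a binomial re-expansion of $\sum c_j e^{iju}$ in powers of $(1 - e^{iu})$; the function names are hypothetical; $\Delta$ is taken to be the backward difference; and the Table 1 petal counts 5, ..., 10 are assumed to be re-indexed to $x = 0, \ldots, 5$.

```python
import math

def theta(x, lam):
    """Poisson term e^{-lam} lam^x / x!, taken as 0 for negative x."""
    return math.exp(-lam) * lam ** x / math.factorial(x) if x >= 0 else 0.0

def delta_k_theta(k, x, lam):
    """k-th backward difference of theta at x."""
    return sum((-1) ** j * math.comb(k, j) * theta(x - j, lam) for j in range(k + 1))

def fourier_coeffs_of_g_over_phi(f, lam, N):
    """c_n of (24): sum_{k=0}^{n} f(n-k) e^{lam} (-lam)^k / k!, for n = 0..N."""
    return [sum(f[n - k] * math.exp(lam) * (-lam) ** k / math.factorial(k)
                for k in range(n + 1))
            for n in range(N + 1)]

def a_from_c(c):
    """Rewrite sum_j c_j z^j as sum_k a_k (1 - z)^k: a_k = (-1)^k sum_{j>=k} C(j, k) c_j."""
    N = len(c) - 1
    return [(-1) ** k * sum(math.comb(j, k) * c[j] for j in range(k, N + 1))
            for k in range(N + 1)]

def approximation(a, lam, x):
    """sum_k a_k Delta^k theta(x)."""
    return sum(ak * delta_k_theta(k, x, lam) for k, ak in enumerate(a))

observed = [133, 55, 23, 7, 2, 2]  # Table 1, column 1, re-indexed (assumption)
lam = 0.631
a = a_from_c(fourier_coeffs_of_g_over_phi(observed, lam, N=2))
print([round(approximation(a, lam, x), 1) for x in range(6)])
```

With three terms the sum reproduces the first three observed frequencies up to rounding error, and the remaining values come out near 9.1, 2.6 and 0.5, which is what column 6 of Table 1 shows.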

It will be observed that the minimum $D$ is connected with the minimum $D^*$ by

    $\min D \le \max |\varphi(u)|^2 \cdot \min D^* \le \min D^* \le \frac{1}{\min |\varphi(u)|^2} \min D = e^{4\lambda} \min D,$

so that all that we can say about the approximation given by (14) with a small $N$
is that it is an upper bound for the best possible mean-square approximation by
sums (18), and that the best mean-square approximation is at worst $e^{-4\lambda}$ times
it. This means that if $D^*$ is small, so is $D$; but $D^*$ is not necessarily small even if
$D$ is. Hence we cannot in general expect the coefficients (14) to be suitable for
practical curve-fitting, since they may increase the mean-square error by a
factor of as much as $e^{4\lambda}$; we may, however, expect (14) to be better when $\lambda$
is small.

Now, as we have already observed,

    $f(x) - \sum_{k=0}^{N} a_k^{(N)} \Delta^k \theta(x)$

is the $x$th Fourier coefficient of

    $g(u) - \varphi(u) \sum_{k=0}^{N} a_k^{(N)} (1 - e^{iu})^k;$

if we write (18) in the form

(25)    $f(x) - \sum_{k=0}^{N} a_k^{(N)} \Delta^k \theta(x) = (2\pi)^{-1} \int_{-\pi}^{\pi} e^{-ixt} \Bigl[ g(t)/\varphi(t) - \sum_{k=0}^{N} a_k^{(N)} (1 - e^{it})^k \Bigr] \varphi(t) \, dt,$

and choose the $a_k^{(N)}$ as specified above, the expression in square brackets is
$g(t)/\varphi(t)$ minus the first $N + 1$ terms of its Fourier series, and so the Fourier





series of $[\cdots]$ involves no $e^{ikt}$ with $k < N + 1$. Since the Fourier series of $\varphi(t)$
involves no $e^{ikt}$ with $k < 0$, the product $\varphi(t)[\cdots]$ also involves no $e^{ikt}$ with
$k < N + 1$, and therefore the integral in (25) is zero for $x = 0, 1, 2, \cdots, N$
(since it represents the $x$th Fourier coefficient of $\varphi(t)[\cdots]$). In other words,

    $f(x) - \sum_{k=0}^{N} a_k^{(N)} \Delta^k \theta(x) = 0, \qquad x = 0, 1, 2, \cdots, N.$

Furthermore, we can compute $f(x) - \sum_{k=0}^{N} a_k^{(N)} \Delta^k \theta(x)$ for $x > N$ by the convolution
formula from the Fourier series of $\varphi(t)$ and $[\cdots]$; for $n > N$, the $n$th Fourier
coefficient of $[\cdots]$ is just that of $g(t)/\varphi(t)$, given by (24), and that of $\varphi(t)$ is
$e^{-\lambda}\lambda^n/n!$; so for $x > N$

    $f(x) - \sum_{k=0}^{N} a_k^{(N)} \Delta^k \theta(x) = \sum_{j=N+1}^{x} \Bigl( \sum_{k=0}^{j} f(j - k) e^{\lambda} (-\lambda)^k / k! \Bigr) \theta(x - j),$

and in particular

    $f(N + 1) - \sum_{k=0}^{N} a_k^{(N)} \Delta^k \theta(N + 1) = \sum_{k=0}^{N+1} f(N + 1 - k)(-\lambda)^k / k!.$


9. Proofs: practical problem. We have so far obtained only an estimate for
the minimum of $D$, by obtaining the minimum of $D^*$; this estimate is satisfactory
for large $N$ and so for theoretical purposes. However, to obtain precisely the
best mean-square approximation to $f(x)$ by a small number $N$ of terms of the
sum in (18), we have to choose the $a_k^{(N)}$ so that

    $\sum_{k=0}^{N} a_k^{(N)} (1 - e^{it})^k \varphi(t)$

is the first $N + 1$ terms of the expansion of $g(t)$ in terms of the set of functions
obtained by replacing $\varphi(t)(1 - e^{it})^k$, $k = 0, 1, 2, \cdots$, by an equivalent orthonormal
set. The process for obtaining this orthonormal set is well known; it
turns out that the integrals that have to be evaluated are expressible in terms of
Bessel functions of imaginary argument; the result is that the first orthonormal
functions are

    $\psi_0(t) = (2\pi)^{-1/2} \alpha_0^{-1/2} \exp(\lambda e^{it}),$

    $\psi_1(t) = (2\pi)^{-1/2} \frac{\alpha_0 e^{it} - \alpha_1}{[\alpha_0(\alpha_0^2 - \alpha_1^2)]^{1/2}} \exp(\lambda e^{it}),$

    $\psi_2(t) = (2\pi)^{-1/2} \frac{\alpha_1^2 - \alpha_0\alpha_2 - \alpha_1(\alpha_0 - \alpha_2)e^{it} - (\alpha_1^2 - \alpha_0^2)e^{2it}}{[(\alpha_1^2 - \alpha_0^2)(\alpha_0 - \alpha_2)(2\alpha_1^2 - \alpha_0^2 - \alpha_0\alpha_2)]^{1/2}} \exp(\lambda e^{it}),$

where $\alpha_0 = J_0(2i\lambda)$, $\alpha_1 = -iJ_1(2i\lambda)$, $\alpha_2 = -J_2(2i\lambda)$. It is then a simple matter,
first to express $\psi_0$, $\psi_1$, $\psi_2$ in terms of $\varphi(t)$, $\varphi(t)(1 - e^{it})$, $\varphi(t)(1 - e^{it})^2$, and then
to determine the coefficients $a_k^{(N)}$ for $N = 0, 1, 2$. For example, the best two-term





approximation for $g(u)$ in terms of $\psi_0(u)$, $\psi_1(u)$ is

    $g(u) \sim \psi_0(u) \int_{-\pi}^{\pi} g(u) \overline{\psi_0(u)} \, du + \psi_1(u) \int_{-\pi}^{\pi} g(u) \overline{\psi_1(u)} \, du,$

and the integrals $\int g(u) \overline{\psi_j(u)} \, du$ are combinations of terms of the form

    $(2\pi)^{-1} \int_{-\pi}^{\pi} g(u) e^{-iku} \overline{\varphi(u)} \, du;$

these in turn are Fourier coefficients of $g(u)\overline{\varphi(u)}$ and so are expressible, by the
Parseval formula, as sums of products of the Fourier coefficients of $g(u)$ (namely, $f(n)$)
and of $\varphi(u)$ (namely, $\theta(n)$). We omit the algebraic work; the results are given in
formulas (9), (10), (11).
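The orthonormalization can be checked numerically. The sketch below is only a check under stated assumptions: it applies the Gram-Schmidt process to $e^{ikt}\exp(\lambda e^{it})$, $k = 0, 1, 2$, written in closed form with $\alpha_k = I_k(2\lambda)$ (the modified Bessel functions, which are the Bessel functions of imaginary argument up to powers of $i$); the normalization and sign of the first-degree function are assumptions of this sketch, and all function names are hypothetical.

```python
import cmath
import math

def bessel_i(k, x, terms=60):
    """Modified Bessel function I_k(x) from its power series (k a nonnegative integer)."""
    return sum((x / 2) ** (2 * m + k) / (math.factorial(m) * math.factorial(m + k))
               for m in range(terms))

def orthonormal_functions(lam):
    """Return psi_0, psi_1, psi_2 built from alpha_k = I_k(2 lam)."""
    a0, a1, a2 = (bessel_i(k, 2 * lam) for k in range(3))
    norm = 1 / math.sqrt(2 * math.pi)
    d1 = a0 * (a0 ** 2 - a1 ** 2)
    d2 = (a1 ** 2 - a0 ** 2) * (a0 - a2) * (2 * a1 ** 2 - a0 ** 2 - a0 * a2)

    def e(t):
        return cmath.exp(lam * cmath.exp(1j * t))

    def psi0(t):
        return norm * e(t) / math.sqrt(a0)

    def psi1(t):
        z = cmath.exp(1j * t)
        return norm * (a0 * z - a1) * e(t) / math.sqrt(d1)

    def psi2(t):
        z = cmath.exp(1j * t)
        poly = (a1 ** 2 - a0 * a2) - a1 * (a0 - a2) * z - (a1 ** 2 - a0 ** 2) * z ** 2
        return norm * poly * e(t) / math.sqrt(d2)

    return psi0, psi1, psi2

def inner(p, q, points=512):
    """Integral over (-pi, pi) of p(t) * conj(q(t)) dt by the rectangle rule."""
    step = 2 * math.pi / points
    return sum(p(-math.pi + j * step) * q(-math.pi + j * step).conjugate()
               for j in range(points)) * step

psis = orthonormal_functions(lam=0.631)
gram = [[inner(p, q) for q in psis] for p in psis]
for row in gram:
    print([round(abs(value), 6) for value in row])
```

The printed matrix of inner products should be the identity up to rounding.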


10. Proofs: continuous case. In the continuous case of our approximation
problem we assume that $|f(x)|^2$ is integrable on $(-\infty, \infty)$ and look for coefficients
$a_k^{(N)}$ that will minimize

    $D = \int_{-\infty}^{\infty} \Bigl| f(x) - \sum_{k=0}^{N} a_k^{(N)} \Delta^k \theta(x) \Bigr|^2 dx,$

where

    $\theta(x) = (2\pi)^{-1} \int_{-\pi}^{\pi} \varphi(u) e^{-ixu} \, du,$

    $\Delta^k \theta(x) = (2\pi)^{-1} \int_{-\pi}^{\pi} \varphi(u) e^{-ixu} (1 - e^{iu})^k \, du.$

Let $f(x)$ be the Fourier transform of $g(u)$; we can regard $\theta(x)$ as the Fourier
transform of $\varphi(u)$, $\varphi(u)$ being defined as zero outside $(-\pi, \pi)$. Then by Parseval's
theorem for Fourier transforms we have

    $2\pi D = \int_{|t| > \pi} |g(t)|^2 \, dt + \int_{-\pi}^{\pi} \Bigl| g(t) - \varphi(t) \sum_{k=0}^{N} a_k^{(N)} (1 - e^{it})^k \Bigr|^2 dt.$

Clearly, then, $D$ cannot be made arbitrarily small unless $g(t) = 0$ almost everywhere
outside $(-\pi, \pi)$; and if this condition is satisfied, $D$ reduces to the same
form which it had in the discrete case; see (19). Thus the problem of mean-square
approximation in the continuous case reduces, if it can be solved at all, to the
corresponding problem in the discrete case.


11. Representation by a series. We consider the representation of a given
$f(x)$ by the B-series with the classical coefficients (5), but with mean-square
convergence of the series. Here we assume that $f(x) \ge 0$, $f(x) = 0$ for $x =
-1, -2, \cdots$, and $\sum [f(x)]^2 < \infty$, and ask whether we can have

(26)    $\lim_{n\to\infty} \sum_{x=-\infty}^{\infty} \Bigl| f(x) - \sum_{k=0}^{n} a_k \Delta^k \theta(x) \Bigr|^2 = 0,$





where here the $a_k$ do not depend on $n$ (but are not, in principle, required to have
the form (5)). From our previous discussion this is equivalent to

    $\lim_{n\to\infty} \int_{-\pi}^{\pi} \Bigl| g(t) - \varphi(t) \sum_{k=0}^{n} a_k (1 - e^{it})^k \Bigr|^2 dt = 0,$

and this implies that

    $\lim_{n\to\infty} |a_n|^2 \int_{-\pi}^{\pi} |\varphi(t)|^2 \, |1 - e^{it}|^{2n} \, dt = 0.$

From this it follows easily that

    $\sum_{n=0}^{\infty} a_n (1 - e^{it})^n$

converges for $-\pi < t < \pi$, or in other words that

    $H(z) = \sum_{n=0}^{\infty} a_n (1 - z)^n$

converges on $|z| = 1$ except perhaps for $z = -1$, and hence converges in
$|1 - z| < 2$. By analytic continuation it is easy to identify $H(z)$ with $F(z)/\Phi(z)$,
where for $|z| < 1$,

    $F(z) = \sum_{n=0}^{\infty} f(n) z^n, \qquad \Phi(z) = \sum_{n=0}^{\infty} \theta(n) z^n = e^{-\lambda(1 - z)}.$

Since $1/\Phi(z)$ has no singular points, $F(z)$ is analytic in $|1 - z| < 2$ and hence in
particular in $0 \le x < 3$; since $F(z)$ is a power series with nonnegative coefficients,
it has a singular point at the positive real point on its circle of convergence
[30, p. 214], and so it must be analytic at least in $|z| < 3$. This gives the restriction
$\limsup_{n\to\infty} |f(n)|^{1/n} \le 1/3$. Nevertheless, as we know, $f(x)$ is represented
in mean-square by a sequence of sums of terms $a_k^{(n)} \Delta^k \theta(x)$ even if we assume
only that $\sum |f(n)|^2$ converges.

In the continuous case, if $f(x) \ge 0$ and we have

(27)    $\lim_{n\to\infty} \int_{-\infty}^{\infty} \Bigl| f(x) - \sum_{k=0}^{n} a_k \Delta^k \theta(x) \Bigr|^2 dx = 0,$

we must have $g(x) = 0$ almost everywhere outside $(-\pi, \pi)$ and then, as we saw
previously, (26) holds also. Now since $f(x) \ge 0$, $g(t)$ has derivatives of all orders
if it has derivatives of all orders at $t = 0$ [29, p. 90] and it is easily seen from
this that $g(t)$ is analytic for all real $t$ if it is analytic at $t = 0$. Now on the one
hand, unless $f(x) \equiv 0$, $g(t)$ cannot be analytic for all real $t$ if (as we are supposing)
$g(t)$ vanishes outside $(-\pi, \pi)$. On the other hand, $H(e^{it}) = g(t)/\varphi(t)$ for real
values of $t$ close to 0 and so, if $t$ is regarded as a complex variable, for complex
values of $t$ near 0. Since $1/\varphi(t)$ is analytic everywhere, $g(t)$ is analytic at $t = 0$.
From this contradiction we infer that a nonnegative $f(x)$ can never be represented
in the form (27), although it may perfectly well be represented by

    $\lim_{n\to\infty} \int_{-\infty}^{\infty} \Bigl| f(x) - \sum_{k=0}^{n} a_k^{(n)} \Delta^k \theta(x) \Bigr|^2 dx = 0.$





REFERENCES

I. Charlier series

[1] A. C. Aitken, Statistical Mathematics, Oliver and Boyd, London, 1939, pp. 68-69, 66-67, 76-79.

[1a] W. Andersson, "Short notes on Charlier's method for expansion of frequency functions in series," Skandinavisk Aktuar. Tids., Vol. 27 (1944), pp. 16-31.

[2] L. A. Aroian, "The type B Gram-Charlier series," Annals of Math. Stat., Vol. 8 (1937), pp. 183-192.

[3] R. P. Boas, Jr., "The Charlier B-series," Amer. Math. Soc. Trans. (to appear).

[4] C. V. L. Charlier, "Über das Fehlergesetz," Ark. Mat. Astr. Fys., Vol. 2, no. 8 (1905).

[5] C. V. L. Charlier, "Die zweite Form des Fehlergesetzes," ibid., no. 15 (1905).

[6] C. V. L. Charlier, "Über die Darstellung willkürlicher Funktionen," ibid., no. 20 (1905).

[7] C. V. L. Charlier, "Researches into the theory of probability," Lunds Universitets Årsskrift, N. F., Afd. 2, Vol. 1, no. 5 (1906).

[8] H. Cramér, Random Variables and Probability Distributions, Cambridge Tracts in Mathematics and Mathematical Physics, no. 36, Cambridge University Press, 1937.

[9] W. P. Elderton, Frequency Curves and Correlation, 3d ed., Cambridge University Press, 1938, pp. 131-132.

[10] A. Fisher, An Elementary Treatise on Frequency Curves and their Application in the Analysis of Death Curves and Life Tables, Macmillan, New York, 1922.

[11] A. Fisher, The Mathematical Theory of Probabilities and Its Application to Frequency Curves and Statistical Methods, Vol. 1, 2d ed., Macmillan, New York, 1922, pp. 271-279.

[12] M. Jacob, "Über die Charlier'sche B-Reihe," Skandinavisk Aktuar. Tids., Vol. 15 (1932), pp. 286-291.

[13] C. Jordan, "Sur la probabilité des épreuves répétées, le théorème de Bernoulli et son inversion," Bull. Soc. Math. France, Vol. 54 (1926), pp. 101-137.

[14] N. R. Jørgensen, "Note sur la fonction de répartition de type B de M. Charlier," Ark. Mat. Astr. Fys., Vol. 10, no. 15 (1915).

[15] N. R. Jørgensen, "Undersøgelser over Frequensflader og Korrelation," Copenhagen thesis, 1916.

[16] M. G. Kendall, The Advanced Theory of Statistics, Vol. 1, 3d ed., Griffin, London, 1947, pp. 154-156.

[17] S. Kullback, "On the Charlier type B series," Annals of Math. Stat., Vol. 18 (1947), pp. 574-581; Vol. 19 (1948), p. 127.

[18] R. von Mises, Vorlesungen aus dem Gebiete der angewandten Mathematik, Vol. 1, Wahrscheinlichkeitsrechnung und ihre Anwendung in der Statistik und theoretischen Physik, Deuticke, Leipzig and Vienna, 1931, pp. 265-269.

[19] N. Obrechkoff, Sur la loi de Poisson, la série de Charlier et les équations linéaires aux différences finies de premier ordre à coefficients constants, Actual. Sci. Ind., no. 740, Hermann et Cie., Paris (1938), pp. 35-64.

[20] H. Pollaczek-Geiringer, "Die Charlier'sche Entwicklung willkürlicher Verteilungen," Skandinavisk Aktuar. Tids., Vol. 11 (1928), pp. 98-111.

[21] H. Pollaczek-Geiringer, "Über die Poissonsche Verteilung und die Entwicklung willkürlicher Verteilungen," Zeits. f. Angew. Math. und Mech., Vol. 8 (1928), pp. 292-309.

[22] H. L. Rietz, Mathematical Statistics, Carus Mathematical Monographs, no. 3, Chicago, 1927, pp. 60-68, 170-177.

[23] E. Schmidt, Über die Charliersche Entwicklung einer "arithmetischen Verteilung" nach den sukzessiven Differenzen der Poissonschen asymptotischen Darstellungsfunktion für die Wahrscheinlichkeit seltener Ereignisse, Sitzungsberichte der Preussischen Akademie der Wissenschaften, Phys.-Math. Kl., 1928, p. 148.

[24] E. Schmidt, "Über die Charlier-Jordansche Entwicklung einer willkürlichen Funktion nach der Poissonschen Funktion und ihren Ableitungen," Zeits. f. Angew. Math. und Mech., Vol. 13 (1933), pp. 139-142.

[25] H. L. Selberg, "Über die Darstellung willkürlicher Funktionen durch Charliersche Differenzreihen," Skandinavisk Aktuar. Tids., Vol. 25 (1942), pp. 228-246.

[26] H. L. Selberg, "Über die Darstellung der Dichtefunktion einer Verteilung durch eine Charliersche B-Reihe," Archiv for Mathematik og Naturvidenskab, Vol. 46 (1943), pp. 127-138.

[27] J. F. Steffensen, Some Recent Researches in the Theory of Statistics and Actuarial Science, Cambridge University Press, 1930, pp Sd-lS.

[27a] J. F. Steffensen, "Note sur la fonction de type B de M. Charlier," Svenska Aktuar. Tids., Vol. 3 (1916), pp. 226-228.

[28] J. V. Uspensky, "On Charles Jordan's series for probability," Annals of Math., Series 2, Vol. 32 (1931), pp. 306-312.

II. Other references

[29] H. Cramér, Mathematical Methods of Statistics, Princeton University Press, 1946.

[30] E. C. Titchmarsh, The Theory of Functions, Oxford, 1932.

[31] A. Zygmund, Trigonometrical Series, Warszawa-Lwów, 1935.

[32] Table of the Bessel Functions $J_0(z)$ and $J_1(z)$ for Complex Arguments, prepared by the Mathematical Tables Project, National Bureau of Standards, 2d ed., Columbia University Press, New York, 1947.



HEURISTIC APPROACH TO THE KOLMOGOROV-SMIRNOV

THEOREMS¹

By J. L. Doob
University of Illinois

1. Introduction and summary. Asymptotic theorems on the difference between
the (empirical) distribution function calculated from a sample and the true
distribution function governing the sampling process are well known. Simple
proofs of an elementary nature have been obtained for the basic theorems of
Kolmogorov² and Smirnov³ by Feller,⁴ but even these proofs conceal to some
extent, in their emphasis on elementary methodology, the naturalness of the
results (qualitatively at least), and their mutual relations. Feller suggested that
the author publish his own approach (which had also been used by Kac), which
does not have these disadvantages, although rather deep analysis would be
necessary for its rigorous justification. The approach is therefore presented (at
one critical point) as heuristic reasoning which leads to results in investigations of
this kind, even though the easiest proofs may use entirely different methods.

No calculations are required to obtain the qualitative results, that is the 
existence of limiting distributions for large samples of various measures of the 
discrepancy between empirical and true distribution functions. The numerical 
evaluation of these limiting distributions requires certain results concerning the 
Brownian movement stochastic process and its relation to other Gaussian 
processes which will be derived in the Appendix. 

2. The problem. Let $x_1, x_2, \cdots$ be mutually independent random variables
with a common distribution function $F(\lambda)$,

    $F(\lambda) = \Pr\{x_j \le \lambda\}.$

In statistical language $x_1, \cdots, x_n$ form a sample of $n$ drawn from the distribution
with distribution function $F(\lambda)$. Let $\nu_n(\lambda)$ be the number of these $x_j$'s which
are $\le \lambda$. According to the strong law of large numbers, for each $\lambda$

(2.1)    $\lim_{n\to\infty} \frac{\nu_n(\lambda)}{n} = F(\lambda)$

with probability 1. For fixed $n$, $\nu_n(\lambda)/n$ is itself a distribution function (which
depends on the sample values $x_1, \cdots, x_n$), the empirical distribution function,
and an elaboration of the argument which led to (2.1) shows that (2.1) is true

¹ Research connected with a probability project at Cornell University under an ONR
contract.

² Inst. Ital. Attuari, Giorn., Vol. 4 (1933), pp. 83-91.

³ Rec. Math. (Matematičeskii Sbornik), N.S. 6, Vol. 48 (1939), pp. 3-26; Bull. Math. Univ.
Moscou, Vol. 2 (1939), fasc. 2.

⁴ Annals of Math. Stat., Vol. 19 (1948), pp. 177-189.



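uniformly in $\lambda$, with probability 1; that is if

(2.2)    $D_n = \underset{-\infty < \lambda < \infty}{\mathrm{L.U.B.}} \Bigl| \frac{\nu_n(\lambda)}{n} - F(\lambda) \Bigr|,$

then $D_n$ is a random variable and

    $\lim_{n\to\infty} D_n = 0$

with probability 1.⁵ This result would be of limited practical statistical importance
except that the distribution of $D_n$ does not depend on the distribution function
$F(\lambda)$ if $F(\lambda)$ is continuous. In fact in that case the random variables $F(x_1)$,
$F(x_2), \cdots$ are mutually independent and each is uniformly distributed in the
interval (0, 1); if $\nu_n(\lambda)$ now denotes the number of $F(x_j)$'s $\le \lambda$, for $j \le n$, then

    $\underset{0 \le \lambda \le 1}{\mathrm{L.U.B.}} \Bigl| \frac{\nu_n(\lambda)}{n} - \lambda \Bigr|$

is equal to the L.U.B. in (2.2). Thus it is no restriction, replacing $x_j$ by $F(x_j)$ if necessary, in finding the distribution
of $D_n$ to assume that $F(\lambda) = \lambda$ for $0 \le \lambda \le 1$, and

(2.2')    $D_n = \underset{0 \le \lambda \le 1}{\mathrm{L.U.B.}} \Bigl| \frac{\nu_n(\lambda)}{n} - \lambda \Bigr|.$

The results will hold for $D_n$ defined by (2.2) for any continuous $F(\lambda)$. We shall
also consider $D_n^+$ and $D_n^-$, defined by

    $D_n^+ = \underset{0 \le \lambda \le 1}{\mathrm{L.U.B.}} \Bigl[ \frac{\nu_n(\lambda)}{n} - \lambda \Bigr],$

    $D_n^- = -\underset{0 \le \lambda \le 1}{\mathrm{G.L.B.}} \Bigl[ \frac{\nu_n(\lambda)}{n} - \lambda \Bigr],$

and again the results will hold (with the obvious definitions of $D_n^+$ and $D_n^-$ in the
general case) for every continuous $F(\lambda)$.

The problem is to find the limiting distributions of (properly normalized)
$D_n$, $D_n^+$, $D_n^-$ when $n \to \infty$.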
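The three discrepancy measures are easy to compute from a sample. The following sketch is illustrative, not part of the original argument: the function name `ks_statistics` and its optional `cdf` argument are hypothetical, and the sample is reduced to the uniform case by applying the (assumed continuous) distribution function first.

```python
def ks_statistics(sample, cdf=None):
    """Return (D_n, D_n_plus, D_n_minus) for a sample.

    If cdf is given, each observation x is first replaced by cdf(x), which
    reduces the problem to the uniform case F(lam) = lam on (0, 1).
    """
    values = sorted(cdf(x) if cdf is not None else x for x in sample)
    n = len(values)
    # nu_n(lam)/n - lam is largest at a sample point (after the jump) ...
    d_plus = max((i + 1) / n - u for i, u in enumerate(values))
    # ... and smallest just below a sample point (before the jump).
    d_minus = max(u - i / n for i, u in enumerate(values))
    return max(d_plus, d_minus), d_plus, d_minus

print(ks_statistics([0.7, 0.1, 0.4]))
```

For the three-point sample shown, the statistics are approximately 0.3, 0.3 and 0.1; the supremum needs to be examined only at the sample points because the empirical distribution function is a step function.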


3. Derivation of the Kolmogorov and Smirnov theorems. Define

    $x_n(t) = n^{1/2} \Bigl( \frac{\nu_n(t)}{n} - t \Bigr), \qquad 0 \le t \le 1.$

Since $\nu_n(0) = 0$ with probability 1 and $\nu_n(t) - \nu_n(s)$ is the number of successes
in $n$ independent trials, with probability $t - s$ of success in each trial,
$\nu_n(t) - \nu_n(s)$ has expectation $n(t - s)$ and variance $n(t - s)[1 - (t - s)]$. Hence

(3.1)    $E\{x_n(t)\} = 0, \qquad 0 \le t \le 1;$

         $E\{[x_n(t) - x_n(s)]^2\} = (t - s)[1 - (t - s)], \qquad 0 \le s \le t \le 1.$

⁵ Cf. M. Fréchet, Généralités sur les probabilités. Variables aléatoires, Paris, 1937, pp.
260-261.





Now let $\{x(t)\}$ be a one parameter family of random variables, $0 \le t \le 1$,
with the following properties:

(a) for each $j$ if $0 \le t_1 < \cdots < t_j \le 1$ the $j$-variate distribution of the random
variables $x(t_1), \cdots, x(t_j)$ is Gaussian;

(b) (3.1) holds, that is

(3.1')    $E\{x(t)\} = 0, \qquad 0 \le t \le 1;$

          $E\{[x(t) - x(s)]^2\} = (t - s)[1 - (t - s)], \qquad 0 \le s \le t \le 1;$

(c) $\Pr\{x(0) = 0\} = 1$.

According to the central limit theorem, the $j$ variate distribution of
$x_n(t_1), \cdots, x_n(t_j)$ is asymptotically that of $x(t_1), \cdots, x(t_j)$; in fact the normalizing
factor $n^{1/2}$ in the definition of $x_n(t)$ and the choice of means and variances in (3.1')
were made precisely to bring this about. As far as first and second moments are
concerned the $x_n(t)$ and $x(t)$ processes are identical; when $n \to \infty$ the distributions,
or at least the $j$ variate ones mentioned, become identical also.

We shall assume, until a contradiction frustrates our devotion to heuristic
reasoning, that in calculating asymptotic $x_n(t)$ process distributions when $n \to \infty$
we may simply replace the $x_n(t)$ processes by the $x(t)$ process. It is clear that this
cannot be done in all possible situations, but let the reader who has never used
this sort of reasoning exhibit the first counter example.

The $x(t)$ process has continuous sample functions (cf. Appendix). Define

    $D = \max_{0 \le t \le 1} |x(t)|,$

    $D^+ = \max_{0 \le t \le 1} x(t),$

    $D^- = -\min_{0 \le t \le 1} x(t).$

Then in accordance with our substitution principle $n^{1/2}D_n$, $n^{1/2}D_n^+$, $n^{1/2}D_n^-$ have as $n$
becomes infinite the distributions of $D$, $D^+$, $D^-$ respectively. (The latter two
are the same because the $-x(t)$ process is stochastically identical with the $x(t)$
process.) Thus these simple qualitative considerations have led to the existence
of the limiting distributions derived and evaluated by Kolmogorov, who proved:

Theorem⁶ (Kolmogorov).

(3.2)    $\lim_{n\to\infty} \Pr\{n^{1/2}D_n > \lambda\} = 2 \sum_{m=1}^{\infty} (-1)^{m+1} e^{-2m^2\lambda^2};$

(3.3)    $\lim_{n\to\infty} \Pr\{n^{1/2}D_n^+ > \lambda\} = \lim_{n\to\infty} \Pr\{n^{1/2}D_n^- > \lambda\} = e^{-2\lambda^2}.$

To complete our treatment we shall prove in the Appendix that

(3.2')    $\Pr\{D > \lambda\} = 2 \sum_{m=1}^{\infty} (-1)^{m+1} e^{-2m^2\lambda^2},$


⁶ In Feller's paper (loc. cit., p. 178, equation (1.4)) the factor 2 in the exponent was
omitted by the printer. The same misprint occurs in Smirnov's table of the values of the
series in our (3.2), Annals of Math. Stat., Vol. 19 (1948), pp. 279-281.





(3.3')    $\Pr\{D^+ > \lambda\} = \Pr\{D^- > \lambda\} = e^{-2\lambda^2},$

so that in fact the above considerations have led not only to the existence but
to the evaluation of the asymptotic distributions. (Actually we shall prove
somewhat more general results about the $x(t)$ process.)
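The limiting tail probabilities in (3.2) and (3.3) are simple to evaluate, and a small simulation gives a feeling for how quickly they are approached. This is a sketch only: the function names, the sample size, the number of repetitions and the seed are arbitrary choices made for illustration, and the series is meant for $\lambda > 0$.

```python
import math
import random

def kolmogorov_tail(lam, terms=100):
    """2 * sum_{m>=1} (-1)^{m+1} exp(-2 m^2 lam^2), the right side of (3.2); lam > 0."""
    return 2 * sum((-1) ** (m + 1) * math.exp(-2 * m * m * lam * lam)
                   for m in range(1, terms + 1))

def one_sided_tail(lam):
    """exp(-2 lam^2), the right side of (3.3)."""
    return math.exp(-2 * lam * lam)

def simulated_tail(lam, n=200, repetitions=2000, seed=0):
    """Monte Carlo estimate of Pr{ n^{1/2} D_n > lam } for uniform samples."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        xs = sorted(rng.random() for _ in range(n))
        d_n = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
        hits += math.sqrt(n) * d_n > lam
    return hits / repetitions

print(kolmogorov_tail(1.0), one_sided_tail(1.0), simulated_tail(1.0))
```

At $\lambda = 1$ the two-sided limit is very nearly 0.27 and the one-sided limit is $e^{-2}$; the simulated frequency should be close to the former, with the usual Monte Carlo scatter.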

So much for the Kolmogorov theorems. Smirnov obtained results (also
independent of the given continuous distribution function $F(\lambda)$) of a somewhat
different nature. Let $x_1^*, x_2^*, \cdots$ be mutually independent random variables
with the same individual distributions as the $x_j$'s, that is each distributed
uniformly in the interval (0, 1); define $\nu_n^*(\lambda)$ as the number of the first $n$ $x_j^*$'s
which are $\le \lambda$. Smirnov considered the difference between empirical distribution
functions,

    $D_{mn} = \underset{0 \le \lambda \le 1}{\mathrm{L.U.B.}} \Bigl| \frac{\nu_m(\lambda)}{m} - \frac{\nu_n^*(\lambda)}{n} \Bigr|,$

as well as $D_{mn}^+$ and $D_{mn}^-$ defined in the obvious way. To avoid stressing the
obvious we consider only the $D_{mn}$.

Theorem (Smirnov). If $m, n \to \infty$ in such a way that $m/n \to r$, and if
$N = mn/(m + n)$,

(3.4)    $\lim \Pr\{N^{1/2}D_{mn} > \lambda\} = 2 \sum_{k=1}^{\infty} (-1)^{k+1} e^{-2k^2\lambda^2}.$

To derive this result define an $x^*(t)$ process stochastically identical with the
$x(t)$ process but independent of it. Then if $x_n^*(t)$ is defined by

    $x_n^*(t) = n^{1/2} \Bigl( \frac{\nu_n^*(t)}{n} - t \Bigr),$

we identify, in accordance with our heuristic principle, the process with variables

    $\{x(t) - r^{1/2}x^*(t)\}$

with the one with variables

    $\Bigl\{ x_m(t) - \Bigl(\frac{m}{n}\Bigr)^{1/2} x_n^*(t) \Bigr\}.$

Doing this leads to the fact that the distribution of

    $\Bigl(\frac{n}{m + n}\Bigr)^{1/2} \underset{0 \le t \le 1}{\mathrm{L.U.B.}} \Bigl| x_m(t) - \Bigl(\frac{m}{n}\Bigr)^{1/2} x_n^*(t) \Bigr|$

converges to that of

    $(1 + r)^{-1/2} \max_{0 \le t \le 1} |x(t) - r^{1/2}x^*(t)|.$





Now the $x(t)$ process and the process with variables

    $\Bigl\{ \frac{x(t) - r^{1/2}x^*(t)}{(1 + r)^{1/2}} \Bigr\}$

are stochastically identical. Hence we are led to the conclusion that the distribution
of $N^{1/2}D_{mn}$ converges to that of $D$, and this is Smirnov's theorem,
stated above. (The method we use does not seem applicable to Smirnov's deeper
theorems on the number of intersections between empirical and true distribution
curves or between pairs of empirical distribution curves.)
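The two-sample quantity in Smirnov's theorem can be computed by comparing the two empirical distribution functions at the pooled sample points. The sketch below is illustrative; the function names `smirnov_statistic` and `normalized_statistic` are hypothetical.

```python
import math

def smirnov_statistic(xs, ys):
    """D_mn: largest absolute difference between the two empirical distribution functions."""
    xs, ys = sorted(xs), sorted(ys)
    m, n = len(xs), len(ys)
    i = j = 0
    d = 0.0
    for point in sorted(xs + ys):
        # Count observations <= point in each sample.
        while i < m and xs[i] <= point:
            i += 1
        while j < n and ys[j] <= point:
            j += 1
        d = max(d, abs(i / m - j / n))
    return d

def normalized_statistic(xs, ys):
    """N^{1/2} D_mn with N = mn/(m + n), the quantity appearing in (3.4)."""
    m, n = len(xs), len(ys)
    return math.sqrt(m * n / (m + n)) * smirnov_statistic(xs, ys)

first, second = [0.1, 0.5, 0.9], [0.2, 0.3]
print(smirnov_statistic(first, second), normalized_statistic(first, second))
```

For these two small samples the largest difference, 2/3, occurs at the point 0.3, where the first empirical distribution function is 1/3 and the second is already 1.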

APPENDIX 

4. The Brownian movement process. Consider any Gaussian stochastic
process, with random variables $\{x(t)\}$ where $t$ varies in some interval. That
is, we assume that for each $t$ in the interval $x(t)$ is a random variable and that
for any $j \ge 1$ if $t_1 < \cdots < t_j$ are in the interval the $j$ variate distribution of
$x(t_1), \cdots, x(t_j)$ is Gaussian. In the following we shall always assume that
$E\{x(t)\} = 0$. Then the process is determined stochastically by the covariance
function

    $r(s, t) = E\{x(s)x(t)\}.$

In particular, if the range of parameter is the interval $[0, \infty)$ and if

    $r(s, t) = \sigma^2 \min(s, t), \qquad 0 \le s, t < \infty,$

the process is called the Brownian movement process, or sometimes the Wiener
process; $\sigma$ is a positive constant. When considering this process we shall write
$\zeta(t)$ instead of $x(t)$. For the $\zeta(t)$ process

    $\Pr\{\zeta(0) = 0\} = 1,$

    $E\{[\zeta(t) - \zeta(s)]^2\} = \sigma^2 |t - s|,$

and if $0 \le s_1 < t_1 \le s_2 < t_2$ the increments $\zeta(t_1) - \zeta(s_1)$ and $\zeta(t_2) - \zeta(s_2)$ are
mutually independent. We shall use the following properties of this process, of
which the first two are well known.

(a) The sample functions are everywhere continuous with probability 1. In
the following we can therefore write as if all the sample curves were continuous.

(b) For fixed $s$

(4.1)    $\Pr\{\max_{0 \le t \le T} [\zeta(s + t) - \zeta(s)] \ge \lambda\} = 2\Pr\{\zeta(s + T) - \zeta(s) \ge \lambda\}.$⁷

(Note that the use of a general initial value $s$, rather than 0, has not added
to the generality and we drop this affectation below.)

(c) If $a \ge 0$, $b > 0$, $\alpha \ge 0$, $\beta > 0$, then

(4.2)    $\Pr\{\underset{0 \le t < \infty}{\mathrm{L.U.B.}} [\zeta(t) - (at + b)] \ge 0\} = e^{-2ab/\sigma^2},$

⁷ Due to Bachelier; cf. the proof by P. Lévy, Comp. Math., Vol. 7 (1939), p. 293. One way
to prove (a) is to prove (4.1) first, with L.U.B. instead of Max, and then use it to calculate
the probabilities relevant to (a).





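(4.3)    $\Pr\{\underset{0 \le t < \infty}{\mathrm{L.U.B.}} [\zeta(t) - (at + b)] \ge 0 \ \text{or} \ \underset{0 \le t < \infty}{\mathrm{G.L.B.}} [\zeta(t) + \alpha t + \beta] \le 0\}$

    $= \sum_{m=1}^{\infty} \Bigl\{ e^{-(2/\sigma^2)[m^2 ab + (m-1)^2 \alpha\beta + m(m-1)(a\beta + \alpha b)]}$

    $\qquad + e^{-(2/\sigma^2)[(m-1)^2 ab + m^2 \alpha\beta + m(m-1)(a\beta + \alpha b)]}$

    $\qquad - e^{-(2/\sigma^2)[m^2(ab + \alpha\beta) + m(m-1)a\beta + m(m+1)\alpha b]}$

    $\qquad - e^{-(2/\sigma^2)[m^2(ab + \alpha\beta) + m(m+1)a\beta + m(m-1)\alpha b]} \Bigr\};$

in particular ($\alpha = a$, $\beta = b$)

    $\Pr\{\underset{0 \le t < \infty}{\mathrm{L.U.B.}} [\,|\zeta(t)| - (at + b)] \ge 0\} = 2 \sum_{m=1}^{\infty} (-1)^{m+1} e^{-2m^2 ab/\sigma^2}.$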

The probability in (4.2) is the probability that a $\zeta(t)$ sample curve will ever
reach the line with slope $a$ and ordinate intercept $b$; the probability in (4.3) is
the probability that a sample curve will ever reach either of the indicated
halflines, one above and one below the $t$ axis. Since the right hand sides are
continuous functions of $a$, $b$, $\alpha$, $\beta$ we could write $> 0$ instead of $\ge 0$ and $< 0$
instead of $\le 0$ on the left, so that these probabilities are also the probabilities
that a sample curve will ever rise above the indicated line or leave the indicated
angle.

It will be convenient to describe a line by its slope and ordinate intercept;
the line $[u, v]$ is the line with slope $u$ and ordinate intercept $v$. We shall take
$\sigma = 1$ in the proof; this is no essential restriction since $\zeta(t)/\sigma$ is the random
variable of a process of the same type whose $\sigma$ is 1.

To prove (4.2) let $\varphi(a, b)$ be the probability on the left, the probability that a
sample curve will reach the line $[a, b]$. If $b = b_1 + b_2$, $b_i > 0$, a sample curve
which is to reach $[a, b]$ must first reach $[a, b_1]$ and then move up to meet a line
with slope $a$, $b_2$ units above the first meeting with $[a, b_1]$. Then

    $\varphi(a, b_1 + b_2) = \varphi(a, b_1)\varphi(a, b_2).$

Now $\varphi(a, b) \ge \Pr\{\zeta(1) \ge a + b\} > 0$ and $\varphi(a, b)$ is monotone non-increasing in $b$,
for fixed $a$. The only solution of the functional equation with these properties is

    $\varphi(a, b) = e^{-\psi(a)b}.$

Now $\varphi(a, b)$ is the probability of reaching $[0, b]$ at some first time $s$ and then
going on to the line $[a, b]$ which from the vantage point of the first common point
$(s, \zeta(s))$ is the line $[a, as]$. In other words, using (4.1),

    $e^{-\psi(a)b} = \int_0^{\infty} \varphi(a, as) \, d_s \Pr\{\max_{0 \le t \le s} \zeta(t) \ge b\} = \int_0^{\infty} e^{-\psi(a)as} \frac{b}{(2\pi)^{1/2}s^{3/2}} e^{-b^2/2s} \, ds = e^{-b[2a\psi(a)]^{1/2}},$

from which it follows that $\psi(a) = 2a$, and this yields (4.2).

To prove (4.3) we consider first the following general problem: Let $[u_1, v_1]$,
$[u_2, v_2], \cdots$, $u_j \ge 0$, $v_j > 0$ be a sequence of lines; let $t = t_1$ be the first value of $t$,
if any, at which a sample curve meets $[u_1, v_1]$; if $t_1$ is defined for a sample curve
let $t_2$ be the first value of $t > t_1$, if any, at which the curve meets $[-u_2, -v_2]$; if
$t_2$ is defined for a sample curve, let $t_3$ be the first value of $t > t_2$, if any, at which
the curve meets $[u_3, v_3]$, and so on. Let $\pi_n$ be the probability that there is a point
$t_n$, in other words the probability that a sample curve meets the lines $[u_1, v_1]$,
$[-u_2, -v_2], \cdots, [(-1)^{n+1}u_n, (-1)^{n+1}v_n]$ in at least $n$ successive points. We write

    $\pi_n = \pi_n(u_1, v_1; \cdots; u_n, v_n).$

In particular, according to (4.2)

(4.4)    $\pi_1(u_1, v_1) = e^{-2u_1v_1}.$

To evaluate $\pi_n$, let $Q$ be the point $(t_{n-1}, \zeta(t_{n-1}))$ on the sample curve, and
suppose for definiteness that $n$ is even. Starting at $Q$, if there is a $t_n$, the curve
must finally reach $[-u_n, -v_n]$, that is it must go to a line of slope $-u_n$, which is
$u_{n-1}t_{n-1} + v_{n-1} + u_nt_{n-1} + v_n$ units vertically below its initial position $Q$ when
$t = t_{n-1}$. According to (4.2) the probability of doing this is

    $e^{-2u_n(u_{n-1}t_{n-1} + v_{n-1} + u_nt_{n-1} + v_n)}.$

Now we replace the line $[-u_n, -v_n]$ by a line which depends on $t_{n-1}$ but which
leaves this probability unchanged; the new line has slope $-(u_{n-1} + u_n)$ and is

    $h = \frac{u_n}{u_{n-1} + u_n}(u_{n-1}t_{n-1} + v_{n-1} + u_nt_{n-1} + v_n)$

units below $Q$ when $t = t_{n-1}$. Finally we reflect this new line in the line parallel
to the $t$ axis through $Q$. These two changes do not affect the probability we are
discussing because the changes of $\zeta(t)$ after $t_{n-1}$ are independent of the changes
before and have symmetric distributions. The final line has slope $u_{n-1} + u_n$
and is $h$ units above $Q$ when $t = t_{n-1}$; it is the line

    $\Bigl[ u_{n-1} + u_n, \ v_{n-1} + \frac{u_nv_{n-1} + u_nv_n}{u_{n-1} + u_n} \Bigr],$

which does not depend on $t_{n-1}$. This line lies above $[u_{n-1}, v_{n-1}]$ in the first quadrant,
so that if a sample curve reaches it the curve must also intersect $[u_{n-1}, v_{n-1}]$.
We have thus proved that

(4.5)    $\pi_n(u_1, v_1; \cdots; u_n, v_n) = \pi_{n-1}\Bigl( u_1, v_1; \cdots; u_{n-2}, v_{n-2}; \ u_{n-1} + u_n, \ v_{n-1} + \frac{u_nv_{n-1} + u_nv_n}{u_{n-1} + u_n} \Bigr).$




The fundamental identity (4.5) makes it possible to reduce the evaluation of $\pi_n$ to $\pi_1$ in $n - 1$ steps; $\pi_1$ is evaluated in (4.4). Thus successive meetings with $n$ lines have been reduced to a meeting with a single line. As a first example suppose

$u_1 = \cdots = u_n = u, \quad v_1 = \cdots = v_n = v.$

Then we have

$\pi_n(u, v; \ldots; u, v) = \pi_{n-1}(u, v; \ldots; 2u, 2v) = \cdots = \pi_1(nu, nv),$

so that

(4.6) $\pi_n(u, v; \ldots; u, v) = e^{-2n^2uv}.$

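More generally suppose

$u_1 = u_3 = \cdots = a, \quad v_1 = v_3 = \cdots = b,$

$u_2 = u_4 = \cdots = \alpha, \quad v_2 = v_4 = \cdots = \beta.$

Then we show that for suitably chosen $C_j^{(n)}$'s we have, according as $n$ is even or odd,

(4.7)
$\pi_n(a, b; \ldots; \alpha, \beta) = \pi_1\!\left(\frac{n}{2}(a + \alpha),\ \frac{C_1^{(n)}ab + C_2^{(n)}\alpha\beta + C_3^{(n)}a\beta + C_4^{(n)}\alpha b}{\frac{n}{2}(a + \alpha)}\right),$

$\pi_n(a, b; \ldots; a, b) = \pi_1\!\left(\frac{n+1}{2}a + \frac{n-1}{2}\alpha,\ \frac{C_1^{(n)}ab + C_2^{(n)}\alpha\beta + C_3^{(n)}a\beta + C_4^{(n)}\alpha b}{\frac{n+1}{2}a + \frac{n-1}{2}\alpha}\right).$

For $n = 1$ this form is correct with

$C_1^{(1)} = 1, \quad C_2^{(1)} = C_3^{(1)} = C_4^{(1)} = 0.$

If now $n$ is even and if the equations are true for $n$, we apply them to the last $n$ of the $n + 1$ lines, which form a sequence of the same type with $(a, b)$ and $(\alpha, \beta)$ interchanged, and then use (4.5) once more:

$\pi_{n+1}(a, b; \ldots; a, b) = \pi_2\!\left(a, b;\ \frac{n}{2}(a + \alpha),\ \frac{C_1^{(n)}\alpha\beta + C_2^{(n)}ab + C_3^{(n)}\alpha b + C_4^{(n)}a\beta}{\frac{n}{2}(a + \alpha)}\right)$

$= \pi_1\!\left(\frac{n+2}{2}a + \frac{n}{2}\alpha,\ \frac{C_1^{(n)}\alpha\beta + C_2^{(n)}ab + C_3^{(n)}\alpha b + C_4^{(n)}a\beta + ab + n(a + \alpha)b}{\frac{n+2}{2}a + \frac{n}{2}\alpha}\right),$

and comparing this with (4.7) we find that

$C_1^{(n+1)} = C_2^{(n)} + n + 1, \quad C_2^{(n+1)} = C_1^{(n)}, \quad C_3^{(n+1)} = C_4^{(n)}, \quad C_4^{(n+1)} = C_3^{(n)} + n.$

If $n$ is odd we find similarly that

$C_1^{(n+1)} = C_2^{(n)} + n, \quad C_2^{(n+1)} = C_1^{(n)}, \quad C_3^{(n+1)} = C_4^{(n)}, \quad C_4^{(n+1)} = C_3^{(n)} + n + 1.$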

The solution of these equations is

                 n even           n odd
$C_1^{(n)}$      $n^2/4$          $(n+1)^2/4$
$C_2^{(n)}$      $n^2/4$          $(n-1)^2/4$
$C_3^{(n)}$      $n(n-2)/4$       $(n^2-1)/4$
$C_4^{(n)}$      $n(n+2)/4$       $(n^2-1)/4$

Then

(4.8)
$\pi_n = e^{-\frac{1}{2}[n^2ab + n^2\alpha\beta + n(n-2)a\beta + n(n+2)\alpha b]}$ ($n$ even),

$\pi_n = e^{-\frac{1}{2}[(n+1)^2ab + (n-1)^2\alpha\beta + (n^2-1)a\beta + (n^2-1)\alpha b]}$ ($n$ odd).

We can now prove (4.3). In fact the left side is equal to

$\pi_1(a, b) + \pi_1(\alpha, \beta) - \pi_2(a, b; \alpha, \beta) - \pi_2(\alpha, \beta; a, b) + \cdots,$

which gives (4.3), on substituting (4.8). Only (4.3'), which follows from the simple (4.6), is used in the application to the Kolmogorov-Smirnov theorems.
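The reduction by (4.5) lends itself to a small numerical check. The following Python sketch is illustrative only: the function names `combine`, `pi_n` and `pi_closed` are hypothetical, and the closed forms coded in `pi_closed` are (4.8) as reconstructed above. It folds a list of lines down to a single line with (4.5), evaluates (4.4), and compares the result with (4.6) and (4.8).

```python
import math

def combine(u1, v1, u2, v2):
    """One application of (4.5): merge the last two lines [u1, v1], [u2, v2]."""
    u = u1 + u2
    v = (u1 * v1 + u2 * v2 + 2.0 * u2 * v1) / u
    return u, v

def pi_n(lines):
    """Fold n lines down to one with (4.5), then evaluate (4.4)."""
    lines = list(lines)
    while len(lines) > 1:
        u2, v2 = lines.pop()
        u1, v1 = lines.pop()
        lines.append(combine(u1, v1, u2, v2))
    u, v = lines[0]
    return math.exp(-2.0 * u * v)

def pi_closed(n, a, b, alpha, beta):
    """Closed form (4.8) for lines alternating (a, b), (alpha, beta), ..."""
    if n % 2 == 0:
        q = n * n * a * b + n * n * alpha * beta \
            + n * (n - 2) * a * beta + n * (n + 2) * alpha * b
    else:
        q = (n + 1) ** 2 * a * b + (n - 1) ** 2 * alpha * beta \
            + (n * n - 1) * a * beta + (n * n - 1) * alpha * b
    return math.exp(-0.5 * q)

# Arbitrary illustrative slopes and intercepts.
a, b, alpha, beta = 0.3, 0.5, 0.7, 0.2
for n in range(1, 7):
    lines = [(a, b) if k % 2 == 0 else (alpha, beta) for k in range(n)]
    print(n, pi_n(lines), pi_closed(n, a, b, alpha, beta))
```

The two printed columns should agree to rounding error, and with all lines equal the folded value should reduce to $e^{-2n^2uv}$ of (4.6).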


5. Transformations of Gaussian processes to the Brownian movement process. The $\zeta(t)$ process studied in section 4 is so simple that it is important to be able to reduce others to it by elementary changes of variable. For example, suppose that the covariance function of a Gaussian process $x(t)$ has the form

(5.1) $r(s, t) = u(s)v(t), \quad s \le t,$

for $s, t$ in some interval, and that the ratio

$a(t) = \frac{u(t)}{v(t)}$

is continuous and monotone increasing, with inverse function $a_1(t)$. We define

$\zeta(t) = \frac{x[a_1(t)]}{v[a_1(t)]}.$

With this definition the $\zeta$ process is Gaussian, and since if $s < t$

$E\{\zeta(s)\zeta(t)\} = \frac{u[a_1(s)]}{v[a_1(s)]} = s,$

the $\zeta$ process is the Brownian movement process with $\sigma = 1$. This transformation from the $x$ to the $\zeta$ process is effected by a combination of a change of variable in $t$ and the application of a variable scaling factor. (Conversely, if such a transformation is applied to the Brownian movement process it is trivial to verify that the new covariance function will have the form (5.1). The Gaussian processes with covariance functions of this form are easily seen to be the Gaussian Markov processes.)

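6. The Gaussian process with $r(s, t) = s(1 - t)$. In section 3 the Kolmogorov-Smirnov theorems were reduced to properties of a Gaussian process with parameter $t$, $0 \le t \le 1$, for which

$\Pr\{x(0) = 0\} = 1;$

$E\{x(t)\} = 0;$

$E\{[x(t) - x(s)]^2\} = (t - s)[1 - (t - s)], \quad 0 \le s \le t \le 1.$

Now these equations imply that

$E\{x(t)^2\} = t(1 - t), \quad E\{x(s)^2\} = s(1 - s),$

and combining the set we find that

$r(s, t) = E\{x(s)x(t)\} = s(1 - t), \quad 0 \le s \le t \le 1.$

This covariance function has the form studied in section 5, and using the transformation of that section

$\zeta(t) = (t + 1)\,x\!\left(\frac{t}{t + 1}\right), \quad 0 \le t < \infty,$

defines a Brownian movement process (with $\sigma = 1$). Then if $D$, $D^+$, $D^-$ are defined as in section 3, we have from (4.3')

$\Pr\{D \ge \lambda\} = \Pr\left\{\mathrm{L.U.B.}_{0 \le t < \infty} \frac{|\zeta(t)|}{t + 1} \ge \lambda\right\} = 2\sum_{m=1}^{\infty} (-1)^{m+1} e^{-2m^2\lambda^2},$

and from (4.2)

$\Pr\{D^+ \ge \lambda\} = \Pr\{D^- \ge \lambda\} = \Pr\left\{\mathrm{L.U.B.}_{0 \le t < \infty} \frac{\zeta(t)}{t + 1} \ge \lambda\right\} = e^{-2\lambda^2}.$

This proves (3.2') and (3.3'). Note that we could go beyond these results, because of our detailed knowledge of the $x(t)$ process. For example we can evaluate

$\lim_{n \to \infty} \Pr\{n^{1/2}D_n^- < \lambda_1,\ n^{1/2}D_n^+ < \lambda_2\}.$

If $\lambda_1 = \lambda_2 = \lambda$ the probability is the probability that $n^{1/2}D_n < \lambda$, which we have already treated. In general it is, in the limit,

$\Pr\{\min_{0 \le t \le 1} x(t) \ge -\lambda_1,\ \max_{0 \le t \le 1} x(t) \le \lambda_2\}$

$= \Pr\left\{\mathrm{G.L.B.}_{0 \le t < \infty} \frac{\zeta(t)}{t + 1} \ge -\lambda_1,\ \mathrm{L.U.B.}_{0 \le t < \infty} \frac{\zeta(t)}{t + 1} \le \lambda_2\right\}$

$= 1 - \sum_{m=1}^{\infty} \left\{ e^{-2[m^2\lambda_2^2 + (m-1)^2\lambda_1^2 + 2m(m-1)\lambda_1\lambda_2]} + e^{-2[(m-1)^2\lambda_2^2 + m^2\lambda_1^2 + 2m(m-1)\lambda_1\lambda_2]} \right.$

$\left. \qquad - e^{-2[m^2(\lambda_1^2 + \lambda_2^2) + m(m-1)\lambda_1\lambda_2 + m(m+1)\lambda_1\lambda_2]} - e^{-2[m^2(\lambda_1^2 + \lambda_2^2) + m(m+1)\lambda_1\lambda_2 + m(m-1)\lambda_1\lambda_2]} \right\}$

$= 1 - \sum_{m=1}^{\infty} \left\{ e^{-2[m\lambda_2 + (m-1)\lambda_1]^2} + e^{-2[(m-1)\lambda_2 + m\lambda_1]^2} - 2e^{-2m^2(\lambda_1 + \lambda_2)^2} \right\},$

obtained by setting $a = b = \lambda_2$, $\alpha = \beta = \lambda_1$ in (4.3).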
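As a sanity check on the series above, the following Python sketch (function names `kolmogorov_tail` and `two_sided_cdf` are hypothetical, and the series are truncated at an assumed 100 terms) evaluates the two-boundary limit and confirms that it reduces to the one-boundary and equal-boundary cases already treated.

```python
import math

def kolmogorov_tail(lam, terms=100):
    """Pr{D >= lam} = 2 * sum_{m>=1} (-1)^(m+1) exp(-2 m^2 lam^2), truncated."""
    return 2.0 * sum((-1) ** (m + 1) * math.exp(-2.0 * m * m * lam * lam)
                     for m in range(1, terms + 1))

def two_sided_cdf(lam1, lam2, terms=100):
    """Limit of Pr{sqrt(n) D_n^- < lam1, sqrt(n) D_n^+ < lam2}, truncated series."""
    total = 0.0
    for m in range(1, terms + 1):
        total += math.exp(-2.0 * (m * lam2 + (m - 1) * lam1) ** 2)
        total += math.exp(-2.0 * ((m - 1) * lam2 + m * lam1) ** 2)
        total -= 2.0 * math.exp(-2.0 * (m * (lam1 + lam2)) ** 2)
    return 1.0 - total

# Equal boundaries reproduce the Kolmogorov limit distribution.
print(two_sided_cdf(0.9, 0.9), 1.0 - kolmogorov_tail(0.9))
# A very distant lower boundary leaves only the one-sided result 1 - exp(-2 lam^2).
print(two_sided_cdf(50.0, 1.0), 1.0 - math.exp(-2.0))
```

Each printed pair should agree to rounding error.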



PEARSONIAN CORRELATION COEFFICIENTS ASSOCIATED WITH
LEAST SQUARES THEORY

By Paul S. Dwyer

University of Michigan

1. Introduction and summary. It is well known that the zero-order correlation between the predicted value of a variable and the observed value of the variable is the multiple correlation. It is also well known that the zero-order correlation between the residuals for two different variables, when the prediction is from a common set of variables, is the partial correlation. These considerations naturally lead to a systematic investigation of all the zero-order correlations involving the various variables associated with least squares theory. Such an investigation is the purpose of this paper.

As a result of this study it appears that other zero-order correlations include the multiple alienation coefficient, the part correlation coefficient, and certain other coefficients which, as far as I am aware, have not been previously defined.

The paper first examines the case of a single predicted variable and then continues with the case in which two or more variables are predicted simultaneously. The paper includes (1) a theoretical development of the different coefficients and the relations between them, (2) the expression of the formulas in determinantal form, (3) a matrix presentation of the material, and (4) an outline of the calculational techniques, with illustrations.

It should be made clear at the start that this paper deals with populations (finite or infinite) and not with samples from those populations. The sampling distribution of each of the new correlation coefficients defined in this paper might well become the subject of a later investigation, but first we need to know what these correlation coefficients are.

2. The case of the single predicted variable. Notation, definitions, and basic properties. We suppose that a population consists of $N$ individuals with values $X_{1i}, X_{2i}, \ldots, X_{ki}, Y_i$ for the variables $X_1, X_2, \ldots, X_k, Y$ and that $Y$ is linearly predicted from the $X_j$ by the formula

(1) $E = Y - a_0 - a_1X_1 - a_2X_2 - \cdots - a_kX_k = Y - \hat{Y}$

by least squares theory. For the purposes of this paper, we use a concise summation notation, $\Sigma Q$, in place of the more formal serial notation $\sum_{i=1}^{N} Q_i$, which is preferable to the frequency notation $\sum_{x=a}^{b} Q_x f_x$ and, in the continuous case, $\int_a^b Q_x f_x\,dx$. Moreover it is desirable that the scales of $X$ and $Y$ be chosen so as to facilitate the easy determination of the various formulas. If we let

(2) $y_i = \frac{Y_i - \bar{Y}}{\sqrt{N}\,\sigma_Y}, \quad x_{ji} = \frac{X_{ji} - \bar{X}_j}{\sqrt{N}\,\sigma_{X_j}},$

we have $\Sigma x_j^2 = \Sigma y^2 = 1$ with the resulting correlation formulas

(3) $\rho_{x_jy} = \frac{\Sigma x_jy}{\sqrt{(\Sigma x_j^2)(\Sigma y^2)}} = \Sigma x_jy \quad \text{and} \quad \rho_{x_ix_j} = \Sigma x_ix_j.$

The transformations (2) when applied to (1) give

(4) $e = \frac{E}{\sqrt{N}\,\sigma_Y} = y - (\beta_1x_1 + \beta_2x_2 + \cdots + \beta_kx_k) = y - \hat{y},$

where the $\beta$'s are standard regression coefficients and $e$ is defined to be $E/(\sqrt{N}\,\sigma_Y)$.

It is to be noted that the values of $y$, $e$, and $\hat{y}$ are all dimensionless.

The values we wish to correlate are those of $X_j$, $Y$, $E$, $\hat{Y}$ of (1). The zero-order correlations involving these are the same as for $x_j$, $y$, $e$, $\hat{y}$ of (4).

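3. Correlations with a single predicted variable. We wish to minimize $\Sigma e^2$. Differentiating with respect to $\beta_j$ and equating to zero we get

(5) $\Sigma ex_j = 0,$

from which by multiplication by $\beta_j$ and summation for $j$,

(6) $\Sigma e\hat{y} = 0.$

It follows that

(7) $\Sigma e^2 = \Sigma e(y - \hat{y}) = \Sigma ey = \Sigma(y - \hat{y})y = \Sigma y^2 - \Sigma \hat{y}y = 1 - \Sigma y\hat{y}.$

Using (4) and (7), we get

$\Sigma e^2 = 1 - \Sigma(e + \hat{y})\hat{y} = 1 - \Sigma \hat{y}^2,$

so that

(8) $\Sigma \hat{y}^2 = \frac{\sigma_{\hat{Y}}^2}{\sigma_Y^2} = 1 - \frac{\sigma_E^2}{\sigma_Y^2}.$

This is the conventional definition (from least squares theory) of the multiple correlation coefficient, so

(9) $\rho^2_{y \cdot x_1x_2 \cdots x_k} = \rho^2_{y(x)} = \Sigma \hat{y}^2 = \Sigma y\hat{y}.$

Application of (9) to (7) gives

(10) $\Sigma e^2 = 1 - \rho^2_{y(x)} = \kappa^2_{y(x)},$

where $\kappa_{y(x)}$ is the multiple alienation coefficient. We now have $\Sigma x_j^2 = 1$, $\Sigma y^2 = 1$, $\Sigma e^2 = \kappa^2_{y(x)}$, and $\Sigma \hat{y}^2 = \rho^2_{y(x)}$, so that we are able to present formulas involving $x$, $y$, $e$, $\hat{y}$. We first form the cross products

(11) $\Sigma x_jy = \rho_{x_jy},$

(12) $\Sigma x_je = 0,$

(13) $\Sigma x_j\hat{y} = \Sigma x_j(\hat{y} + e) = \Sigma x_jy = \rho_{x_jy},$

(14) $\Sigma ye = \Sigma y(y - \hat{y}) = \Sigma y^2 - \Sigma y\hat{y} = 1 - \Sigma \hat{y}^2 = \kappa^2_{y(x)},$

(15) $\Sigma y\hat{y} = \Sigma \hat{y}^2 = \rho^2_{y(x)},$

(16) $\Sigma e\hat{y} = 0.$

We then have

(17) $\rho_{x_ix_j} = \frac{\Sigma x_ix_j}{\sqrt{(\Sigma x_i^2)(\Sigma x_j^2)}} = \Sigma x_ix_j,$

(18) $\rho_{x_jy} = \frac{\Sigma x_jy}{\sqrt{(\Sigma x_j^2)(\Sigma y^2)}} = \Sigma x_jy,$

(19) $\rho_{x_je} = \frac{\Sigma x_je}{\sqrt{(\Sigma x_j^2)(\Sigma e^2)}} = 0,$

(20) $\rho_{x_j\hat{y}} = \frac{\Sigma x_j\hat{y}}{\sqrt{(\Sigma x_j^2)(\Sigma \hat{y}^2)}} = \frac{\Sigma x_jy}{\rho_{y(x)}} = \frac{\rho_{x_jy}}{\rho_{y(x)}}.$

It is interesting to note that this is unity in case $k = 1$, for then $\rho_{xy} = \rho_{y(x)}$. Otherwise the absolute value of $\rho_{x\hat{y}}$ is larger than that of $\rho_{xy}$. For this reason this coefficient might be called the multiple augmented correlation coefficient.

(21) $\rho_{ye} = \frac{\Sigma ye}{\sqrt{(\Sigma y^2)(\Sigma e^2)}} = \frac{\kappa^2_{y(x)}}{\kappa_{y(x)}} = \kappa_{y(x)}.$

Thus the correlation between $y$ and its residual is the multiple alienation coefficient.

(22) $\rho_{y\hat{y}} = \frac{\Sigma y\hat{y}}{\sqrt{(\Sigma y^2)(\Sigma \hat{y}^2)}} = \frac{\rho^2_{y(x)}}{\rho_{y(x)}} = \rho_{y(x)}.$

Thus, as is well known, the zero-order correlation between observed and predicted $y$ is the multiple correlation.

(23) $\rho_{e\hat{y}} = \frac{\Sigma e\hat{y}}{\sqrt{(\Sigma e^2)(\Sigma \hat{y}^2)}} = 0.$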
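The relations (7) to (23) can be checked on a small artificial population. The Python sketch below is illustrative only: the simulated data, the seed, and the helper names (`standardize`, `solve`, `corr`) are assumptions made for the example and are not part of the paper. It scales the variables as in (2), solves the normal equations (5), and compares the zero-order correlations with $\rho_{y(x)}$ and $\kappa_{y(x)}$.

```python
import math
import random

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def standardize(values):
    """Scale as in (2): deviations from the mean divided by sqrt(N)*sigma."""
    mean = sum(values) / len(values)
    root = math.sqrt(sum((t - mean) ** 2 for t in values))
    return [(t - mean) / root for t in values]

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small linear system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def corr(u, v):
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

random.seed(0)
N, k = 200, 3
X = [[random.gauss(0, 1) for _ in range(N)] for _ in range(k)]
Y = [X[0][i] + 0.5 * X[1][i] + random.gauss(0, 1) for i in range(N)]

x = [standardize(col) for col in X]
y = standardize(Y)
R = [[dot(xi, xj) for xj in x] for xi in x]          # correlations of the x's
beta = solve(R, [dot(xi, y) for xi in x])            # normal equations (5)
y_hat = [sum(bj * xj[i] for bj, xj in zip(beta, x)) for i in range(N)]
e = [yi - yh for yi, yh in zip(y, y_hat)]

rho = math.sqrt(dot(y_hat, y_hat))                   # multiple correlation, (9)
kappa = math.sqrt(dot(e, e))                         # multiple alienation, (10)
print(corr(y, y_hat), rho)                           # (22)
print(corr(y, e), kappa)                             # (21)
print(corr(x[0], y_hat), dot(x[0], y) / rho)         # (20)
print(corr(e, y_hat))                                # (23), zero up to rounding
```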


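4. Notation for the general case. We need to extend the notation and the definitions before examining explicit formulas for the more general case of two (or more) predicted variables. Suppose that $Y_i$ and $Y_j$ are the two variables predicted from the same $X$'s. Then from (4) we write

(24)
$e_i = \frac{E_i}{\sqrt{N}\,\sigma_{Y_i}} = y_i - \Sigma \beta_{ik}x_k = y_i - \hat{y}_i,$

$e_j = \frac{E_j}{\sqrt{N}\,\sigma_{Y_j}} = y_j - \Sigma \beta_{jk}x_k = y_j - \hat{y}_j.$

We then have the two sets of normal equations

(25) $\Sigma e_ix = 0, \quad \Sigma e_jx = 0,$

so that

(26) $\Sigma e_i\hat{y}_i = 0, \quad \Sigma e_j\hat{y}_i = 0, \quad \Sigma e_i\hat{y}_j = 0, \quad \Sigma e_j\hat{y}_j = 0.$

It follows that

(27) $\Sigma e_ie_j = \Sigma e_i(y_j - \hat{y}_j) = \Sigma e_iy_j = \Sigma(y_i - \hat{y}_i)y_j = \Sigma y_iy_j - \Sigma \hat{y}_iy_j$

$\qquad = \Sigma y_iy_j - \Sigma y_i\hat{y}_j = \Sigma y_iy_j - \Sigma \hat{y}_i\hat{y}_j = \rho_{ij} - \Sigma \hat{y}_i\hat{y}_j,$

if we use the notation that $\rho_{ij} = \rho_{y_iy_j}$.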

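5. The correlations involving more than one predicted variable. In this case the $y$'s, the $e$'s and the $\hat{y}$'s (as well as the $x$'s) can have more than one variable, so that the correlation coefficients we need, in addition to those of section 3, are $\rho_{y_iy_j}$, $\rho_{e_ie_j}$, $\rho_{\hat{y}_i\hat{y}_j}$, $\rho_{y_ie_j}$, $\rho_{y_i\hat{y}_j}$, $\rho_{e_i\hat{y}_j}$, and $\rho_{\hat{y}_ie_j}$. We need now only the summed products

(28) $\Sigma y_iy_j = \rho_{y_iy_j} = \rho_{ij},$

(29) $\Sigma e_ie_j = \rho_{ij} - \Sigma \hat{y}_i\hat{y}_j$ as given in (27),

(30) $\Sigma y_ie_j = \Sigma y_i(y_j - \hat{y}_j) = \Sigma y_iy_j - \Sigma y_i\hat{y}_j = \rho_{ij} - \Sigma \hat{y}_i\hat{y}_j,$

(31) $\Sigma y_i\hat{y}_j = \Sigma \hat{y}_i\hat{y}_j, \quad \Sigma e_i\hat{y}_j = 0.$

We have then

(32) $\rho_{y_iy_j} = \frac{\Sigma y_iy_j}{\sqrt{(\Sigma y_i^2)(\Sigma y_j^2)}} = \Sigma y_iy_j,$

(33) $\rho_{e_ie_j} = \frac{\Sigma e_ie_j}{\sqrt{(\Sigma e_i^2)(\Sigma e_j^2)}} = \frac{\rho_{ij} - \Sigma \hat{y}_i\hat{y}_j}{\kappa_{i(x)}\kappa_{j(x)}}.$

This is the partial correlation coefficient.

(34) $\rho_{\hat{y}_i\hat{y}_j} = \frac{\Sigma \hat{y}_i\hat{y}_j}{\sqrt{(\Sigma \hat{y}_i^2)(\Sigma \hat{y}_j^2)}} = \frac{\Sigma \hat{y}_i\hat{y}_j}{\rho_{i(x)}\rho_{j(x)}}.$

This coefficient appears to be new. Since it is the correlation of predicted values, I suggest that it be called the predictions correlation coefficient.

(35) $\rho_{y_ie_j} = \frac{\Sigma y_ie_j}{\sqrt{(\Sigma y_i^2)(\Sigma e_j^2)}} = \frac{\rho_{ij} - \Sigma \hat{y}_i\hat{y}_j}{\kappa_{j(x)}},$

(36) $\rho_{e_iy_j} = \frac{\Sigma e_iy_j}{\sqrt{(\Sigma e_i^2)(\Sigma y_j^2)}} = \frac{\rho_{ij} - \Sigma \hat{y}_i\hat{y}_j}{\kappa_{i(x)}}.$

The correlations given by (35) and (36) have been defined previously and are known as part correlation coefficients [1; 213, 497].

(37) $\rho_{y_i\hat{y}_j} = \frac{\Sigma y_i\hat{y}_j}{\sqrt{(\Sigma y_i^2)(\Sigma \hat{y}_j^2)}} = \frac{\Sigma \hat{y}_i\hat{y}_j}{\rho_{j(x)}},$

(38) $\rho_{\hat{y}_iy_j} = \frac{\Sigma \hat{y}_iy_j}{\sqrt{(\Sigma \hat{y}_i^2)(\Sigma y_j^2)}} = \frac{\Sigma \hat{y}_i\hat{y}_j}{\rho_{i(x)}}.$

The correlations of (37) and (38) appear to be new. Each is, in a sense, a generalization of the multiple correlation coefficient since it becomes the multiple correlation coefficient when $i = j$. I suggest that it might be called the cross multiple correlation coefficient, since it correlates the actual value of one variable with the predicted value of another.

(39) $\rho_{e_i\hat{y}_j} = \frac{\Sigma e_i\hat{y}_j}{\sqrt{(\Sigma e_i^2)(\Sigma \hat{y}_j^2)}} = 0, \quad \rho_{\hat{y}_ie_j} = \frac{\Sigma \hat{y}_ie_j}{\sqrt{(\Sigma \hat{y}_i^2)(\Sigma e_j^2)}} = 0.$

A summary of definitions and names of Pearsonian correlation coefficients associated with least squares theory is presented in Table I. No name is proposed when the coefficient is identically zero.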


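6. Relations between the correlations. Many relations exist between the correlations defined in earlier sections. Some of the more interesting of these are obtained by the elimination of $\Sigma \hat{y}_i\hat{y}_j$ from formulas involving this term. Thus from (34), (37), and (38) we get

$\Sigma \hat{y}_i\hat{y}_j = \rho_{\hat{y}_i\hat{y}_j}\rho_{i(x)}\rho_{j(x)} = \rho_{y_i\hat{y}_j}\rho_{j(x)} = \rho_{\hat{y}_iy_j}\rho_{i(x)},$

and from (33), (35), and (36) we get

$\rho_{ij} - \Sigma \hat{y}_i\hat{y}_j = \rho_{e_ie_j}\kappa_{i(x)}\kappa_{j(x)} = \rho_{y_ie_j}\kappa_{j(x)} = \rho_{e_iy_j}\kappa_{i(x)}.$

We then have

(40) $\rho_{ij} - \rho_{\hat{y}_i\hat{y}_j}\rho_{i(x)}\rho_{j(x)} = \rho_{ij} - \rho_{y_i\hat{y}_j}\rho_{j(x)} = \rho_{ij} - \rho_{\hat{y}_iy_j}\rho_{i(x)} = \rho_{e_ie_j}\kappa_{i(x)}\kappa_{j(x)} = \rho_{y_ie_j}\kappa_{j(x)} = \rho_{e_iy_j}\kappa_{i(x)},$

where the six members may be equated in all possible ways.

Interesting and simple relations can also be obtained by formation of ratios. Thus

(41) $\frac{\rho_{e_ie_j}}{\rho_{y_ie_j}} = \frac{1}{\kappa_{i(x)}}, \quad \frac{\rho_{e_ie_j}}{\rho_{e_iy_j}} = \frac{1}{\kappa_{j(x)}}, \quad \text{so} \quad \frac{\rho_{y_ie_j}}{\rho_{e_iy_j}} = \frac{\kappa_{i(x)}}{\kappa_{j(x)}}.$

TABLE I

Definition                                              Name

Single predicted variable
$\rho_{x_ix_j}$                                         Correlation coefficient of zero order
$\rho_{xy}$                                             Correlation coefficient of zero order
$\rho_{xe} = 0$                                         None
$\rho_{x\hat{y}} = \rho_{xy}/\rho_{y(x)}$               *Multiple augmented correlation coefficient
$\rho_{ye} = \kappa_{y(x)}$                             Multiple alienation coefficient
$\rho_{y\hat{y}} = \rho_{y(x)}$                         Multiple correlation coefficient
$\rho_{e\hat{y}} = 0$                                   None

Two or more predicted variables
$\rho_{y_iy_j}$                                         Correlation coefficient of zero order
$\rho_{e_ie_j}$                                         Partial correlation coefficient
$\rho_{\hat{y}_i\hat{y}_j}$                             *Predictions correlation coefficient
$\rho_{y_ie_j}$                                         Part correlation coefficient
$\rho_{y_i\hat{y}_j}$                                   *Cross multiple correlation coefficient
$\rho_{e_i\hat{y}_j} = 0$                               None

* Proposed name

Similarly

(42) $\frac{\rho_{y_i\hat{y}_j}}{\rho_{\hat{y}_iy_j}} = \frac{\rho_{i(x)}}{\rho_{j(x)}}.$

The geometric mean of similar coefficients yields such expressions as

(43) $\sqrt{\rho_{y_ie_j}\,\rho_{e_iy_j}} = \rho_{e_ie_j}\sqrt{\kappa_{i(x)}\kappa_{j(x)}}, \quad \sqrt{\rho_{y_i\hat{y}_j}\,\rho_{\hat{y}_iy_j}} = \rho_{\hat{y}_i\hat{y}_j}\sqrt{\rho_{i(x)}\rho_{j(x)}}.$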


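7. Determinantal formulas. The implicit normal equations (5) become when expanded

(44)
$\rho_{11}\beta_1 + \rho_{12}\beta_2 + \cdots + \rho_{1k}\beta_k = \rho_{1y}$
$\rho_{21}\beta_1 + \rho_{22}\beta_2 + \cdots + \rho_{2k}\beta_k = \rho_{2y}$
$\cdots$
$\rho_{k1}\beta_1 + \rho_{k2}\beta_2 + \cdots + \rho_{kk}\beta_k = \rho_{ky}$

while $\Sigma y\hat{y} = \rho^2_{y(x)}$ becomes

(45) $\rho_{y1}\beta_1 + \rho_{y2}\beta_2 + \cdots + \rho_{yk}\beta_k = \rho^2_{y(x)}.$

Let $\Delta$ be the determinant of the matrix of the correlations of the $k$ $x$'s and $y$. Let $\Delta'$ be the corresponding determinant with $\rho_{yy}$ replaced by $\rho^2_{y(x)}$. Let $\Delta_{yy}$ be the determinant of the correlation matrix of the $k$ $x$'s. Then $\rho^2_{y(x)} = \Sigma \hat{y}^2 = \Sigma y\hat{y}$ can be expressed as a function of $\Delta$ and $\Delta_{yy}$. If (44) and (45) are to hold simultaneously, then $\Delta' = 0$. Expanding $\Delta'$ in terms of the bottom row, we get

(46) $\Delta' = 0 = \rho^2_{y(x)}\Delta_{yy} + \text{"terms"}.$

Similarly

(47) $\Delta = \rho_{yy}\Delta_{yy} + \text{"terms"},$

where the "terms" of (46) and (47) are identical. It follows by subtraction that $\Delta = (1 - \rho^2_{y(x)})\Delta_{yy}$ and hence that

(48) $\Sigma \hat{y}^2 = \Sigma y\hat{y} = \rho^2_{y(x)} = 1 - \frac{\Delta}{\Delta_{yy}}.$

Then

(49) $\Sigma e^2 = \Sigma ey = \kappa^2_{y(x)} = 1 - \Sigma \hat{y}^2 = 1 - \left(1 - \frac{\Delta}{\Delta_{yy}}\right) = \frac{\Delta}{\Delta_{yy}}.$

Correlation formulas of section 3 then appear as

(50) $\rho_{x\hat{y}} = \frac{\rho_{xy}}{\sqrt{1 - \Delta/\Delta_{yy}}},$

(51) $\rho_{ye} = \sqrt{\frac{\Delta}{\Delta_{yy}}},$

(52) $\rho_{y\hat{y}} = \sqrt{1 - \frac{\Delta}{\Delta_{yy}}}.$

In a similar way the normal equations (25) become two sets of normal equations. The first set is like (44) with $\beta_j$ replaced by $\beta_{ij}$ and $\rho_{jy}$ replaced by $\rho_{jy_i}$. The second set is similar with $i$ replaced by $j$. It is desired to find

(53) $\Sigma \hat{y}_i\hat{y}_j = \Sigma y_i\hat{y}_j = \rho_{y_i1}\beta_{j1} + \rho_{y_i2}\beta_{j2} + \cdots + \rho_{y_ik}\beta_{jk}.$

Now using (53) with (51) as applied to $y_j$, and using the technique of the first part of this section, we get

(54) $\Delta_{y_iy_j} = \rho_{y_iy_j}\Delta_{y_iy_j,y_iy_j} + \text{"terms"},$

(55) $0 = \Sigma \hat{y}_i\hat{y}_j\,\Delta_{y_iy_j,y_iy_j} + \text{"terms"},$

where $\Delta$ is the determinant of the matrix of the correlations of the $k$ $x$'s, $y_i$, and $y_j$; $\Delta_{y_iy_j}$ is the determinant obtained by deleting the column involving correlations of $y_i$ and the row involving correlations of $y_j$; $\Delta_{y_iy_j,y_iy_j}$ is the determinant of the matrix of the $k$ $x$'s; and the "terms" in (54) and (55) are identical. It follows that

(56) $\Sigma \hat{y}_i\hat{y}_j = \rho_{ij} - \frac{\Delta_{y_iy_j}}{\Delta_{y_iy_j,y_iy_j}}$

and thence

(57) $\rho_{ij} - \Sigma \hat{y}_i\hat{y}_j = \frac{\Delta_{y_iy_j}}{\Delta_{y_iy_j,y_iy_j}}.$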

The formulas of section (5) then appear in determinant form as follows 


(58) 


as is well known. 



Formulas for p«.„, and p„.„, are similar to (60) and (61). 

Modern methods of calculating determinants [2], [3], [4], [5] are advised if calculations are to be made from these formulas.

8. Matrix formulas. A matrix presentation is very useful in exhibiting the general features of this theory and in developing compact and easy methods of calculation with finite populations. The matrix presentation here is similar to that given by the author in a previous article [6].

Let the normal equations (24) be represented by the matrix equation

(62) $E = Y - XB = Y - \hat{Y}.$

Then the sets of normal equations become

$X'E = 0 \quad \text{or} \quad X'(Y - XB) = 0,$

so that

(63) $X'XB = X'Y.$

Now since $XB = \hat{Y}$, (63) can be written as $X'\hat{Y} = X'Y$ and it can be shown that

(64) $\hat{Y}'\hat{Y} = \hat{Y}'Y = Y'\hat{Y}.$

But under the assumptions of section 2, $X'X$ is the matrix of the intercorrelations of the $x$'s, $X'Y$ is the matrix of the intercorrelations of the $x$'s and $y$'s, and $Y'Y$ is the matrix of the intercorrelations of the $y$'s. Hence (63) can be written

(65) $R_{xx}B = R_{xy},$

so that

(66) $B = R_{xx}^{-1}R_{xy}.$

If $Y$ is composed of a single variable, $B$ is a single column matrix (vector), but if $Y$ is composed of $m$ variables, $B$ is an $m$ column matrix. It follows at once that

(67) $\hat{Y}'\hat{Y} = \hat{Y}'Y = B'X'XB = B'R_{xx}B = R'_{xy}R_{xx}^{-1}R_{xy}$

and that

(68) $E'E = (Y - XB)'E = Y'E = Y'(Y - XB) = Y'Y - Y'\hat{Y} = Y'Y - \hat{Y}'\hat{Y} = R_{yy} - R'_{xy}R_{xx}^{-1}R_{xy}.$

It thus appears that the matrix (67) has diagonal terms $\rho^2_{i(x)} = \Sigma \hat{y}_i^2$, which are the squares of the multiple correlation coefficients, and that the non-diagonal terms are $\Sigma \hat{y}_i\hat{y}_j = \Sigma y_i\hat{y}_j$. Similarly the matrix (68) has diagonal terms $\Sigma e_i^2 = \kappa^2_{i(x)}$ and non-diagonal terms $\Sigma e_ie_j = \Sigma e_iy_j$. It follows that all the correlation coefficients defined above may be calculated from the matrices $R_{xx}$, $R_{xy}$, $R_{yy}$, $\hat{Y}'\hat{Y}$, and $E'E$. The matrix (67) might be called the multiple correlation matrix and the matrix (68) the multiple alienation matrix.

Conventional results are expressed in terms of the correlation matrices $R_{xx}$, $R_{xy}$, and $R_{yy}$. All the correlation coefficients defined in this paper may be expressed in terms of these matrices and the multiple correlation and alienation matrices.


9. Calculational method of determining the multiple correlation and multiple alienation matrices. Various methods might be used in calculating the multiple correlation and alienation matrices from the correlation matrices. One method utilizes the square root method of solving simultaneous equations, which has recently been presented in a number of places [7], [8], together with a device which is similar to that used by Aitken [9] in eliminating the back solution. This method solves the equation (65) by forming the auxiliary

(69) $S_{xx}B = S_{xx}R_{xx}^{-1}R_{xy},$

where $S_{xx}$ is a triangular matrix such that

(70) $R_{xx} - S'_{xx}S_{xx} = 0.$

TABLE II

General                                    Illustration

                                   $x_1$    $x_2$    $x_3$    $x_4$    $y_1$    $y_2$

$R_{yy}$                                                              1.000     .495
                                                                        --     1.000

$R_{xx}$ | $R_{xy}$                1.000     .652     .554     .615     .313     .650
                                     --     1.000     .747     .693     .280     .803
                                     --       --     1.000     .774     .182     .804
                                     --       --       --     1.000     .166     .812

$S_{xx}$ | $S_{xx}R_{xx}^{-1}R_{xy}$   1.000     .652     .554     .615     .313     .650
                                             .758     .509     .385     .100     .500
                                                      .659     .360    -.064     .287
                                                               .586    -.072     .199

$\hat{Y}'\hat{Y}$                                                      .117     .221
                                                                        --      .794

$E'E$                                                                  .883     .274
                                                                        --      .206

The right hand side of (69), when premultiplied by its transpose, yields

(71) $(S_{xx}R_{xx}^{-1}R_{xy})'(S_{xx}R_{xx}^{-1}R_{xy}) = R'_{xy}R_{xx}^{-1}S'_{xx}S_{xx}R_{xx}^{-1}R_{xy} = R'_{xy}R_{xx}^{-1}R_{xy} = \hat{Y}'\hat{Y}.$

Speaking less technically, it is only necessary to multiply the columns of $S_{xx}R_{xx}^{-1}R_{xy}$ to get $\hat{Y}'\hat{Y}$.

A first illustration utilizes the correlations of the Carver anthropometric data [10] for 1000 University of Michigan freshmen. This group may be regarded as constituting a population, or it may be regarded as a random sample of a larger population. For present purposes we regard it as a population. Height ($Y_1$) and weight ($Y_2$) are estimated from shoulder girth ($X_1$), chest girth ($X_2$), waist girth ($X_3$), and right thigh girth ($X_4$). The calculation of $\hat{Y}'\hat{Y}$ and $E'E$ from the correlation matrices is shown in Table II.
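The square root calculation of (69) to (71) can be sketched in a few lines of Python. This is an illustrative sketch, not the desk-machine layout of the paper: the function names `sqrt_factor` and `forward` are hypothetical, and the correlations are those of the first illustration as read from Table II (with $r_{13} = .554$).

```python
import math

def sqrt_factor(R):
    """Upper-triangular S with S'S = R, as required by (70)."""
    n = len(R)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        S[i][i] = math.sqrt(R[i][i] - sum(S[k][i] ** 2 for k in range(i)))
        for j in range(i + 1, n):
            S[i][j] = (R[i][j] - sum(S[k][i] * S[k][j] for k in range(i))) / S[i][i]
    return S

def forward(S, C):
    """Solve S'T = C, so that T = S R^{-1} C, the right hand side of (69)."""
    n, m = len(S), len(C[0])
    T = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for c in range(m):
            T[i][c] = (C[i][c] - sum(S[k][i] * T[k][c] for k in range(i))) / S[i][i]
    return T

R_xx = [[1.000, .652, .554, .615],
        [.652, 1.000, .747, .693],
        [.554, .747, 1.000, .774],
        [.615, .693, .774, 1.000]]
R_xy = [[.313, .650], [.280, .803], [.182, .804], [.166, .812]]
R_yy = [[1.000, .495], [.495, 1.000]]

T = forward(sqrt_factor(R_xx), R_xy)
m = len(R_yy)
# (71): multiply the columns of T to get the multiple correlation matrix.
YY = [[sum(row[c] * row[d] for row in T) for d in range(m)] for c in range(m)]
# (68): the multiple alienation matrix is R_yy minus the multiple correlation matrix.
EE = [[R_yy[c][d] - YY[c][d] for d in range(m)] for c in range(m)]
print([[round(v, 3) for v in row] for row in YY])
print([[round(v, 3) for v in row] for row in EE])
```

The printed matrices can be compared with the last two blocks of Table II; small differences in the third decimal are to be expected because the table was carried to three places at every step.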























As a second illustration I use the correlation between the parts of two forms of the Thorndike Intelligence Examination which Lorge has used in illustrating canonical correlation technique [11, 69-74]. The $X$'s are the scores on the three parts of Form A and the $Y$'s are the scores on the three parts of Form B. In this case we designate the results by $r$'s and $k$'s (rather than $\rho$'s and $\kappa$'s) since the calculation is considered to be for a sample. The calculation of the sample multiple correlation and multiple alienation matrices is presented in Table III.


TABLE III

                                       Form A                          Form B
                             $x_1$     $x_2$     $x_3$     $y_1$     $y_2$     $y_3$

$R_{yy}$                                                   1.0000     .8235
                                                             --      1.0000

$R_{xx}$ | $R_{xy}$          1.0000               .7852     .8986     .7841     .8217
                               --                 .8393     .7961     .8543     .8254
                               --        --                 .7683     .8226     .8588

$S_{xx}$ | $S_{xx}R_{xx}^{-1}R_{xy}$   1.0000               .7852     .8986     .7841     .8217
                                                  .3609     .1487     .3864     .2920
                                                  .5032     .0180     .1341     .2146

$\hat{Y}'\hat{Y}$                                           .8299     .7646     .7858
                                                             --       .7821     .7861
                                                             --        --       .8069

$E'E$                                                       .1701     .0590     .0054
                                                             --       .2179     .0454
                                                             --        --       .1931

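10. The numerical values of the coefficients. The diagonal entries of the multiple correlation matrix give the values of $\Sigma \hat{y}_i^2 = \Sigma y_i\hat{y}_i = \rho^2_{i(x)}$, while the non-diagonal values are $\Sigma \hat{y}_i\hat{y}_j = \Sigma y_i\hat{y}_j$. The diagonal entries of the multiple alienation matrix are $\Sigma e_i^2 = \Sigma y_ie_i = \kappa^2_{i(x)}$, while the non-diagonal entries are $\Sigma e_ie_j = \Sigma e_iy_j = \Sigma y_ie_j$. We are then able to write out any of the correlations easily. Thus from Table II

$\rho_{1(x)} = \sqrt{\Sigma \hat{y}_1^2} = \sqrt{.117} = .342,$

$\rho_{2(x)} = \sqrt{\Sigma \hat{y}_2^2} = \sqrt{.794} = .891,$

$\kappa_{1(x)} = \sqrt{\Sigma e_1^2} = \sqrt{.883} = .940,$

$\kappa_{2(x)} = \sqrt{\Sigma e_2^2} = \sqrt{.206} = .454,$

$\rho_{e_1e_2} = \frac{\Sigma e_1e_2}{\sqrt{(\Sigma e_1^2)(\Sigma e_2^2)}} = \frac{.274}{\sqrt{(.883)(.206)}} = .642,$

$\rho_{\hat{y}_1\hat{y}_2} = \frac{\Sigma \hat{y}_1\hat{y}_2}{\sqrt{(\Sigma \hat{y}_1^2)(\Sigma \hat{y}_2^2)}} = \frac{.221}{\sqrt{(.117)(.794)}} = .725,$

$\rho_{y_1e_2} = \frac{\Sigma e_1e_2}{\kappa_{2(x)}} = \frac{.274}{.454} = .604,$

$\rho_{y_2e_1} = \frac{\Sigma e_1e_2}{\kappa_{1(x)}} = \frac{.274}{.940} = .291,$

$\rho_{y_1\hat{y}_2} = \frac{\Sigma \hat{y}_1\hat{y}_2}{\rho_{2(x)}} = \frac{.221}{.891} = .248,$

$\rho_{y_2\hat{y}_1} = \frac{\Sigma \hat{y}_1\hat{y}_2}{\rho_{1(x)}} = \frac{.221}{.342} = .646.$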


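TABLE IVa

General. Each diagonal cell gives $r_{i(x)}$ above $\Sigma \hat{y}_i^2$. Each cell above the diagonal has three lines: $r_{\hat{y}_i\hat{y}_j}$; then $r_{\hat{y}_iy_j}$ | $r_{y_i\hat{y}_j}$; then $\Sigma \hat{y}_i\hat{y}_j$.

Illustration

            j = 1        j = 2                 j = 3

i = 1                    .9489                 .9603
            .9110        .8392 | .8644         .8626 | .8747
            .8299        .7646                 .7858

i = 2                                          .9917
                         .8844                 .8889 | .8751
                         .7821                 .7861

i = 3
                                               .8983
                                               .8069

TABLE IVb

General. Each diagonal cell gives $k_{i(x)}$ above $\Sigma e_i^2$. Each cell above the diagonal has three lines: $r_{e_ie_j}$; then $r_{e_iy_j}$ | $r_{y_ie_j}$; then $\Sigma e_ie_j$.

Illustration

            j = 1        j = 2                 j = 3

i = 1                    .3066                 .0298
            .4124        .1431 | .1264         .0131 | .0123
            .1701        .0590                 .0054

i = 2                                          .2214
                         .4668                 .0973 | .1033
                         .2179                 .0454

i = 3
                                               .4394
                                               .1931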













It is possible to utilize a scheme of successive division if all these correlations are desired when there are more than two predicted variables. By divisions we compute in turn $\rho_{i(x)}$, $\rho_{\hat{y}_iy_j}$, $\rho_{y_i\hat{y}_j}$, and $\rho_{\hat{y}_i\hat{y}_j}$ from the multiple correlation matrix and $\kappa_{i(x)}$, $\rho_{e_iy_j}$, $\rho_{y_ie_j}$, $\rho_{e_ie_j}$ from the multiple alienation matrix for each $i$, $j$. The computational scheme is illustrated in Table IV where the correlations used are the sample correlations of Table III. The calculations from the multiple correlation matrix are presented in Table IVa and those from the multiple alienation matrix in Table IVb.

In Table IVa the multiple correlation matrix is first entered on the third of each three lines. The square root of each diagonal term is then extracted to give the multiple correlation coefficients. The value of $r_{1(x)}$ is then locked in the machine as a divisor and it is divided, in turn, into $\Sigma \hat{y}_1\hat{y}_2$ and $\Sigma \hat{y}_1\hat{y}_3$ to get $r_{\hat{y}_1y_2}$ and $r_{\hat{y}_1y_3}$. Then $r_{2(x)}$ is used as a divisor by division into $r_{\hat{y}_1y_2}$ to get $r_{\hat{y}_1\hat{y}_2}$, into $\Sigma \hat{y}_1\hat{y}_2$ to get $r_{y_1\hat{y}_2}$, and into $\Sigma \hat{y}_2\hat{y}_3$ to get $r_{\hat{y}_2y_3}$. Finally $r_{3(x)}$ is divided into $r_{\hat{y}_1y_3}$ to get $r_{\hat{y}_1\hat{y}_3}$, into $\Sigma \hat{y}_1\hat{y}_3$ to get $r_{y_1\hat{y}_3}$, into $r_{\hat{y}_2y_3}$ to get $r_{\hat{y}_2\hat{y}_3}$, and into $\Sigma \hat{y}_2\hat{y}_3$ to get $r_{y_2\hat{y}_3}$. A check on these divisions can be made, if desired, by dividing $r_{y_1\hat{y}_2}$ by $r_{1(x)}$ to get $r_{\hat{y}_1\hat{y}_2}$, $r_{y_1\hat{y}_3}$ by $r_{1(x)}$ to get $r_{\hat{y}_1\hat{y}_3}$, and $r_{y_2\hat{y}_3}$ by $r_{2(x)}$ to get $r_{\hat{y}_2\hat{y}_3}$.

Table IVb is treated in a similar manner.

This technique is immediately applicable to the case of many predicted variables.
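The scheme of successive division is easy to mechanize. The Python sketch below is an illustration under stated assumptions: the function name `successive_division` is hypothetical, and the input is the multiple correlation matrix $\hat{Y}'\hat{Y}$ of the second illustration as it appears on the third lines of Table IVa.

```python
import math

def successive_division(M):
    """From a symmetric matrix of summed products, return the square roots of the
    diagonal terms and, for each i < j, the three correlations obtained by
    dividing the off-diagonal term by the first root, the second root, and both."""
    n = len(M)
    root = [math.sqrt(M[i][i]) for i in range(n)]
    derived = {}
    for i in range(n):
        for j in range(i + 1, n):
            by_i = M[i][j] / root[i]      # e.g. correlation of yhat_i with y_j
            by_j = M[i][j] / root[j]      # e.g. correlation of y_i with yhat_j
            by_both = by_i / root[j]      # e.g. correlation of yhat_i with yhat_j
            derived[(i + 1, j + 1)] = (by_i, by_j, by_both)
    return root, derived

YHAT = [[.8299, .7646, .7858],
        [.7646, .7821, .7861],
        [.7858, .7861, .8069]]
root, derived = successive_division(YHAT)
print([round(r, 4) for r in root])
for pair, values in sorted(derived.items()):
    print(pair, [round(v, 4) for v in values])
```

Applied to the multiple alienation matrix $E'E$, the same function gives $k_{i(x)}$ and the correlations of Table IVb.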


REFERENCES

[1] M. Ezekiel, Methods of Correlation Analysis, Second Edition, Wiley, 1942.
[2] A. C. Aitken, "On the evaluation of determinants, the formation of their adjugates, etc.," Edinb. Math. Soc. Proc., Series 2, Vol. 3 (1933), pp. 207-219.
[3] P. S. Dwyer, "The evaluation of determinants," Psychometrika, Vol. 6 (1941), pp. 191-204.
[4] A. C. Aitken, Determinants and Matrices, Second Edition, Oliver and Boyd, Edinburgh, 1942.
[5] F. V. Waugh and P. S. Dwyer, "Compact computation of the inverse of a matrix," Annals of Math. Stat., Vol. 16 (1946), pp. 359-371.
[6] P. S. Dwyer, "A matrix presentation of least squares and correlation theory, etc.," Annals of Math. Stat., Vol. 16 (1944), pp. 82-89.
[7] P. S. Dwyer, "The square root method and its use in correlation and regression," Am. Stat. Assn. Jour., Vol. 40 (1946), pp. 493-503.
[8] D. B. Duncan and J. F. Kenney, On the Solution of Normal Equations and Related Topics, Edwards Brothers, Ann Arbor, 1940.
[9] A. C. Aitken, "The evaluation of a certain triple product matrix," Roy. Soc. of Edinb. Proc., Vol. 57 (1937), pp. 172-181.
[10] H. C. Carver, Anthropometric Data, Edwards Brothers, Ann Arbor, 1941.
[11] Irving Lorge, "The computation of Hotelling canonical correlations," Proceedings of Educational Research Forum, Endicott, N. Y., Aug. 26-31, 1940, pp. 68-74.



INVERSION FORMULAS IN NORMAL VARIABLE MAPPING

By John Riordan
Bell Telephone Laboratories, New York

1. Summary. The two inversion formulas considered here arise from study of G. A. Campbell's work on the Poisson summation, which is described more fully in the introduction and in the main consists of finding a function or mapping of a variable connected with the summation in terms of a normal (Gaussian) variable $g$. More generally, this last is a process often called "normalization of the variable" and associated with the names of E. A. Cornish and R. A. Fisher. The mapping is two-way and the main inversion formula determines coefficients for one way from those for the other, both sets of coefficients being descriptive of their mappings. More precisely, if $x$ is a given variable, $g$ a Gaussian variable, $y$ a parameter of the mapping, and the two mappings are

$x = g + \sum_{n=1}^{\infty} G_n(g)\,y^n/n!,$

$g = x + \sum_{n=1}^{\infty} X_n(x)\,y^n/n!,$

the formula expresses $G_n(x)$ in terms of $X_i(x)$, $i \le n$, and vice versa.

The second formula is more particularly related to the Poisson summation and relates coefficients $p_n = p_n(g)$ and $q_n = q_n(g)$ in the pair of equations

$a = c \sum_{n=0}^{\infty} q_n c^{-n/2}/n!,$

$c = a \sum_{n=0}^{\infty} p_n a^{-n/2}/n!.$

Both formulas, which are necessarily elaborate, are given concise expression by the use of the multi-variable polynomials of E. T. Bell.

2. Introduction. In 1923, in a paper little known in statistical circles, G. A. Campbell [2] gave as the basis for his extensive tabulation of the Poisson summation an asymptotic series expressing the average $a$ in terms of a normal variable $g$, corresponding to the probability of at least $c$ occurrences, and $c$ itself. That is to say, he associated with the Poisson summation

$P(a, c) = \sum_{x=c}^{\infty} e^{-a}a^x/x!$

a normal variable $g$, defined by

$P(a, c) = (2\pi)^{-1/2} \int_{-\infty}^{g} e^{-x^2/2}\,dx,$

and inverted the summation (which, as is well known, is equivalent to the incomplete Gamma function ratio) to give a series for $a$ in terms of $g$ and $c$. The series, which is carried to 11 terms, starts as follows:

$a = c\left[1 + g c^{-1/2} + \frac{g^2 - 1}{3}\,c^{-1} + \frac{g^3 - 7g}{36}\,c^{-3/2} + \cdots\right].$

If $x = (a - c)/\sqrt{c}$ is introduced, this becomes

$x = g + \frac{g^2 - 1}{3}\,c^{-1/2} + \frac{g^3 - 7g}{36}\,c^{-1} + \cdots,$

and $x$ is seen to be, like $g$, a standardized variable of mean 0, variance 1.

It seems to have gone unnoticed that this result includes the $\chi^2$ distribution through the transformation $2a = \chi^2$, $2c = n$, and it has been rediscovered by A. M. Peiser [7] (4 terms) and by Goldberg and Levine [4] (6 terms).

It is possible also to express $c$ in terms of $a$ and $g$, and a formula of this kind with fewer terms, which appears in a footnote in Campbell's paper, is as follows:

$c = a\left[1 - g a^{-1/2} + \frac{g^2 + 2}{6}\,a^{-1} + \frac{g^3 + 2g}{72}\,a^{-3/2} + \cdots\right].$

Finally there is a third possibility of expressing $g$ in terms of the remaining variables, preferably $x$ and $c$; though unnoticed by Campbell this has since been brought to prominence by Cornish and Fisher [3], Hotelling and Frankel [5] and Kendall [6].

The idea behind the first expansion appears most clearly in the second form and is that for $c$ large the variable $x$ behaves nearly like $g$. The third possibility reverses this expansion and gives a function of $x$ and $c$ which behaves like $g$; hence if this function is first evaluated, reference to the normal integral table gives an immediate evaluation of the probabilities in question. Put in another way, the expansion widens the scope of the normal integral table and for this reason has been called "normalization" of the variable (but this term seems preempted by its use in another sense for orthogonal functions, and has been replaced in the title by normal variable mapping).

From the point of view of statistical theory, the three expressions are different versions of one relationship, which suggests that there should be general rules for transforming a series of one type into that of another. The two inversion formulas given below supply these rules in what appears to be as compact a form as the problem allows. It will be noted that the proofs given suppose convergent series, a case which leads to clarity and brevity and is interesting in itself. Applied to Campbell's series, they give the known results so far as the latter go, but of course for other asymptotic series they need independent verifications.


3. First Inversion Formula. This relate coeflficients in series like Campbell’s 
first and its reverse as in Cornish and Fisher. More precisely 



INVBESION rORMTJLAS 


419 


If diig)) Giig) ••• are assigned polynomials and if 

tc 

(1) a: = + 2 Gnig)y''/n\, 

defines x in terms of g and a parameter y, then 

(2) g = x + i XnW/nK 
where 

(3) -Xn{x) = YniaGiix), aGi(x), • • • , aGJx)), 

TABLE 1 

Bell Polynomials F„ {fgi ,fg 2 ’ •• fon) 

Fi = figi ^ 

F 2 = fiQi + /sffl 

F 3 = /ijfs + fi&giQi) + /sffi 

Yi = fig* + + fsi^gigi) + figi ^ 

Fc = /iff6 + fti^g^gi + lOffaffa) + ft (lOffsffi + 15<72 Pi) 

+ /4 (io< 72?J) + AsS 

F« = figt + fiiQgtgi + 15ff4?2 + lOffa) 

+ fiO-^gigl + OOgtQtgi + ISpI) 

+ fi(20g3gl + A&g\g\) + Ml^g^gt) + figi 
Yr = figi + fnilgtgi M- 21<76S2 + S^gtgi) 

+ /3(21j76£fi + I05gtgigi + 70gtgi + lOSgsS'a) 

+ fi{S5gigl + 210^3^2^1 + lOSjraffi) 

+ MSSg^gt + 105172V!) + M^igm) + fm 

Fs = figa + fti^gtgi + 2833^2 + ^Qgsgs + 35^4) ^ 

+ /3(28g6^! + IQSgsg^gi + 280^4^36'! + 210^4^2 + 280^3172) 

+ filfiGg^gl + 420gr4^2^! + 280^Vi “t" + 105^2) 

+ /6(70<74?! + dGQgtgig] + 420!7l!7!) 

+ M^Ggtgl + 210j72j7i) + f 7 ( 28 g 2 gl) + hgi 

F„ being the multivariable polynomial of E. T. Bell [1], in the variables Gi(x) to 
On{x) and the symbolic variable a which ts such that 

a* ^ ai = (-Dy~\ D = d/etc, 

with differentiations on all products of Gi(x) to Gn{x) associated with it in the poly¬ 
nomial. 

Note the symmetry of x and g, which allows the transformation to go either
way, the inverse of (3) being

(4) $-G_n(g) = Y_n(aX_1(g), aX_2(g), \ldots, aX_n(g))$

Table I gives explicit expressions for polynomials $Y_1$ to $Y_8$. It will be noted
that the number of terms in $Y_n$ is the number of partitions of n and that $f_i$, the
variable replacing $a_i$ in the table, is associated with terms corresponding to
partitions with i parts; that is to say, if $Y_{n,i}$ designates such terms

$Y_n = \sum_{i=1}^{n} f_i\, Y_{n,i}$

The verification or extension of the table may be accomplished by the formulas 
and relations given by Bell (l.c.) or more directly by those modifications of Bell 
given by myself in [8]. 
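
As an illustration of how the table can be verified or extended, the sketch below generates the terms of $Y_n$ from the partitions of n. It is not taken from Bell or from [8]: it relies on the standard closed form for the coefficient of a partition with $m_j$ parts equal to j, namely $n!/\prod_j (j!)^{m_j} m_j!$, and the function names are chosen here for illustration only.

```python
from math import factorial


def partitions(n, max_part=None):
    """Yield the partitions of n as non-increasing tuples of parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def bell_coefficient(part):
    """Coefficient of the product of g_j over the parts j of `part`.

    Uses n! / prod((j!)^m_j * m_j!), where m_j is the multiplicity of j.
    """
    coeff = factorial(sum(part))
    for j in set(part):
        m = part.count(j)
        coeff //= factorial(j) ** m * factorial(m)
    return coeff


def bell_terms(n):
    """Map i (number of parts, the index of f_i) to its (coefficient, partition) terms."""
    table = {}
    for part in partitions(n):
        table.setdefault(len(part), []).append((bell_coefficient(part), part))
    return table


if __name__ == "__main__":
    # Print Y_5 grouped by f_i; a partition (3, 1, 1) stands for g_3 * g_1^2.
    for i, terms in sorted(bell_terms(5).items()):
        print(f"f_{i}:", " + ".join(f"{c}*g{p}" for c, p in terms))
```

Grouping by the number of parts reproduces the arrangement of Table I, and summing the coefficients attached to a given $f_i$ gives a quick check on any row.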

The first few instances of (3), dropping the common variable x for brevity,
may be read off from Table I (with appropriate changes of notation and interpretation
of $a_i$) as follows:

$-X_1 = G_1$

$-X_2 = G_2 - D(G_1^2)$

$-X_3 = G_3 - 3D(G_2 G_1) + D^2(G_1^3)$

$-X_4 = G_4 - 4D(G_3 G_1) - 3D(G_2^2) + 6D^2(G_2 G_1^2) - D^3(G_1^4)$

Applied to Campbell's first formula in its second form with $y = c^{-1/2}$ and

$G_1(x) = (x^2 - 1)/3$, $G_3(x) = (-6x^4 - 14x^2 + 32)/270$,

$G_2(x) = (x^3 - 7x)/18$, $G_4(x) = (9x^5 + 256x^3 - 433x)/1620$,

these show e.g.

$-X_2 = \frac{x^3 - 7x}{18} - \frac{2(x^2 - 1)}{3} \cdot \frac{2x}{3} = \frac{-7x^3 + x}{18}$,

and similarly for the others, resulting in

$X_1 = -(x^2 - 1)/3$

$X_2 = (7x^3 - x)/18$

$X_3 = -(219x^4 - 14x^2 - 13)/270$

$X_4 = (3993x^5 - 152x^3 + 119x)/1620$

These determine a calculation formula for the Poisson summation, which is
a refinement of the normal approximation. That is to say

$P(a, c) = H(g)$

with

$g = x - \frac{x^2 - 1}{3\sqrt{c}} + \frac{7x^3 - x}{36c} - \frac{219x^4 - 14x^2 - 13}{1620\, c\sqrt{c}} + \frac{3993x^5 - 152x^3 + 119x}{38880\, c^2}$

and $x = (a - c)/\sqrt{c}$.





For the t-variate, the formula is applied in the reverse direction since Hotelling
and Frankel supply the first four values of $X_n$, that is, in present notation, the
series

$g = x - \frac{x^3 + x}{4}\, y + \frac{13x^5 + 8x^3 + 3x}{48}\, \frac{y^2}{2} - \frac{35x^7 + 19x^5 + x^3 - 15x}{64}\, \frac{y^3}{6} + \frac{6271x^9 + 3224x^7 - 102x^5 - 1680x^3 - 945x}{3840}\, \frac{y^4}{24}$

The reversed series (obtained by (4)) is

$x = g + \frac{g^3 + g}{4}\, y + \frac{5g^5 + 16g^3 + 3g}{48}\, \frac{y^2}{2} + \frac{3g^7 + 19g^5 + 17g^3 - 15g}{64}\, \frac{y^3}{6} + \frac{79g^9 + 776g^7 + 1482g^5 - 1920g^3 - 945g}{3840}\, \frac{y^4}{24}$

The first three terms are checked by Goldberg and Levine (l.c.).
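
Both applications can be checked mechanically. The following is a minimal sketch, not part of the original derivation: polynomials are stored as lists of exact fractions (constant term first), the helper names (`poly`, `partial_bell`, `invert`) are illustrative, and the partial Bell polynomials $Y_{n,i}$ are built from their standard recurrence rather than read from Table I. The denominator of $G_4$ is taken as 1620, the value for which the stated numerator of $X_4$ is reproduced.

```python
from fractions import Fraction as F
from math import comb


def poly(coeffs_high_to_low, denom):
    """Build a coefficient list (constant term first) from integers over a common denominator."""
    return [F(c, denom) for c in reversed(coeffs_high_to_low)]


def padd(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]


def pmul(p, q):
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def pderiv(p):
    return [i * a for i, a in enumerate(p)][1:] or [F(0)]


def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def partial_bell(n, k, G):
    """Y_{n,k}(G_1, ..., G_n) via Y_{n,k} = sum_j C(n-1, j-1) G_j Y_{n-j,k-1}."""
    if n == 0 and k == 0:
        return [F(1)]
    if n == 0 or k == 0:
        return [F(0)]
    total = [F(0)]
    for j in range(1, n - k + 2):
        term = pmul(G[j], partial_bell(n - j, k - 1, G))
        total = padd(total, [comb(n - 1, j - 1) * c for c in term])
    return total


def invert(n, G):
    """Formula (3): X_n = -sum_i (-D)^(i-1) Y_{n,i}(G_1, ..., G_n)."""
    total = [F(0)]
    for i in range(1, n + 1):
        term = partial_bell(n, i, G)
        for _ in range(i - 1):          # apply (-D) exactly i-1 times
            term = [-c for c in pderiv(term)]
        total = padd(total, term)
    return trim([-c for c in total])


# Campbell's first formula: G_1 .. G_4 as polynomials in x.
CAMPBELL_G = {
    1: poly([1, 0, -1], 3),
    2: poly([1, 0, -7, 0], 18),
    3: poly([-6, 0, -14, 0, 32], 270),
    4: poly([9, 0, 256, 0, -433, 0], 1620),
}

# Hotelling and Frankel's series for the t-variate: X_1 .. X_3.
T_VARIATE_X = {
    1: poly([-1, 0, -1, 0], 4),
    2: poly([13, 0, 8, 0, 3, 0], 48),
    3: poly([-35, 0, -19, 0, -1, 0, 15, 0], 64),
}

if __name__ == "__main__":
    for n in range(1, 5):
        print("Campbell X_%d:" % n, invert(n, CAMPBELL_G))
    for n in range(1, 4):
        print("t-variate G_%d:" % n, invert(n, T_VARIATE_X))
```

Because (3) and (4) have the same form, the single routine `invert` serves in both directions: it turns the $G_n$ into the $X_n$ for Campbell's series and the $X_n$ into the $G_n$ for the t-variate.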

Another application worth noting is to the formulas of Cornish and Fisher
which give $G_n(g)$ and $X_n(x)$ in terms of the relative cumulants of the distribution;
to save space these are omitted.

The derivation of the formula may be indicated most easily by Lagrange's
formula for the expansion of one function in powers of another in the following
form¹:

Let C be a contour in the complex z plane enclosing the point z = x, and let
$f(z)$ and $\phi(z)$ be analytic on and inside C. Let y be such that $|y\phi(z)| < |z - x|$
when z is on C, and g be that root of the equation

(5) $g = x + y\phi(g)$

which lies inside C. Then

(6) $f(g) = \frac{1}{2\pi i} \oint_C f(z)\, \frac{d}{dz}\{\log [z - x - y\phi(z)]\}\, dz = f(x) + \sum_{n=1}^{\infty} X_n(x)\, y^n/n!$

where

(7) $X_n(x) = D^{n-1}[f'(x)(\phi(x))^n]$

The contour integral in (6) appears, slightly disguised, as a problem in Whittaker
and Watson [Modern Analysis, Cambridge, 1920, p. 149]. The evaluation
(7) is given for completeness, though no use is made of it in this section, the
derivation proceeding directly from (6).

¹ The author owes the suggestion for this to S. O. Rice, who also simplified the derivation
of the second inversion formula given later.

First notice that by (1) and (5)

$-y\phi(g) = \sum_{n=1}^{\infty} G_n(g)\, y^n/n!$,

so that the logarithm in (6) may be written

$\log \left(z - x + \sum_{n=1}^{\infty} G_n(z)\, y^n/n!\right)$,

or

$\log (z - x) + \log \left[1 + \sum_{n=1}^{\infty} G_n(z)(z - x)^{-1} y^n/n!\right]$,

or

(8) $\log (z - x) + \log \exp by$,

with b a symbolic variable such that

$b^0 \equiv b_0 = 1$

$b^n \equiv b_n = G_n(z)(z - x)^{-1}$.

Now if

(9) $\log (\exp by) = B_1 y + B_2 y^2/2! + \cdots = \exp By$,

B being another symbolic variable, $B_0 = 0$, $B^n \equiv B_n$, it follows from equation
(5) of [8] that

(10) $B_n = D_y^n [\log (\exp by)]_{y=0}, \quad D_y = d/dy,$

$\quad = Y_n(\beta b_1, \beta b_2, \ldots, \beta b_n)$

$\quad = \sum_{i=1}^{n} \beta_i\, Y_{n,i}(b_1, b_2, \ldots, b_n)$,

with $\beta_i = (-1)^{i-1}(i - 1)!$ and $Y_{n,i}$ the part of polynomial $Y_n$ having i parts,
as defined above. Moreover, each factor $b_k$ of terms in $Y_{n,i}$ contributes
$G_k(z)(z - x)^{-1}$ so that

(11) $B_n = \sum_{i=1}^{n} \beta_i (z - x)^{-i}\, Y_{n,i}(G_1(z), G_2(z), \ldots, G_n(z))$

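Then, by (6)

$f(g) = f(x) + \frac{1}{2\pi i} \oint_C f(z)\, \frac{d}{dz}\left[\sum_{n=1}^{\infty} B_n \frac{y^n}{n!}\right] dz$

$\quad = f(x) - \sum_{n=1}^{\infty} \frac{y^n}{n!} \sum_{i=1}^{n} \frac{\beta_i}{2\pi i} \oint_C \frac{f'(z)\, Y_{n,i}(G_1(z), \ldots, G_n(z))}{(z - x)^{i}}\, dz$

$\quad = f(x) - \sum_{n=1}^{\infty} \frac{y^n}{n!} \sum_{i=1}^{n} (-D)^{i-1}[f'(x)\, Y_{n,i}(G_1(x), \ldots, G_n(x))]$

with D = d/dx. The evaluation in the last line is by the Cauchy formula for
derivatives; the second line is derived by an integration by parts.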

Equation (4) follows from this and the substitution f{g) = g. 


4. Second Inversion Formula. This gives the interrelations of coefficients of
series like the two Campbell series mentioned in the introduction. It runs as
follows:

If $q_1(g), q_2(g), \ldots$ are given polynomials and if

(12) $a = c \sum_{n=0}^{\infty} \frac{q_n(g)\, c^{-n/2}}{n!}$

defines a in terms of g and a parameter c; then

(13) $c = a \sum_{n=0}^{\infty} \frac{p_n(g)\, a^{-n/2}}{n!}$

where

(14) $-p_n(g) = Y_n(\alpha q_1(g), \alpha q_2(g), \ldots, \alpha q_n(g))$

with $\alpha \equiv \alpha_1 = 1$; $\alpha^i \equiv \alpha_i = (n - 4)(n - 6) \cdots (n - 2i)\, 2^{1-i}$.

Equation (14) is formally similar to (3) and by symmetry as before, $q_n(g)$ is
readily expressible as a $Y_n$ polynomial in $p_1(g)$ to $p_n(g)$.

The first five instances of (14), dropping the argument for brevity, are

$-p_1 = q_1$

$-p_2 = q_2 - q_1^2$

$-p_3 = q_3 - \frac{3}{2} q_2 q_1 + \frac{3}{4} q_1^3$

$-p_4 = q_4$

$-p_5 = q_5 + \frac{5}{2}(q_4 q_1 + 2q_3 q_2) - \frac{5}{4}(2q_3 q_1^2 + 3q_2^2 q_1) + \frac{15}{4} q_2 q_1^3 - \frac{15}{16} q_1^5$

Applied to Campbell's first series where

$q_1(g) = g$, $q_3(g) = (g^3 - 7g)/6$,

$q_2(g) = \frac{2}{3}(g^2 - 1)$, $q_4(g) = (-12g^4 - 28g^2 + 64)/135$,

$q_5(g) = (36g^5 + 1024g^3 - 1732g)/1296$

these show that

$p_1(g) = -g$, $p_3(g) = (g^3 + 2g)/12$,

$p_2(g) = (g^2 + 2)/3$, $p_4(g) = (12g^4 + 28g^2 - 64)/135$,

$p_5(g) = (207g^5 + 2596g^3 - 6148g)/1296$

The proof of (14) is as follows. First, for brevity introduce symbolic variables
p and q with the usual interpretation $p^n \equiv p_n(g)$, $q^n \equiv q_n(g)$ so that (12) and
(13) read

$a = c \exp (q c^{-1/2})$

$c = a \exp (p a^{-1/2})$

Now write $a = 1/x^2$, $c = 1/y^2$, changing these to

$x = y (\exp qy)^{-1/2}$

$y = x (\exp px)^{-1/2}$

and note that

(15) $x^2 y^{-2} = (\exp qy)^{-1} = \exp px$

which shows that $p_n$ is the coefficient of $x^n/n!$ in the expansion in powers of x
of $(\exp qy)^{-1}$. Lagrange's formula gives at once (D = d/dy):

(16) $f(y) = f(0) + \sum_{n=1}^{\infty} \frac{x^n}{n!} D^{n-1}[f'(y)(\exp qy)^{n/2}]_{y=0}$

so that

$(\exp qy)^{-1} = 1 + \sum_{n=1}^{\infty} \frac{x^n}{n!} D^{n-1}[-(\exp qy)^{(n-4)/2} D(\exp qy)]_{y=0}$

or

(17) $-p_n = \frac{2}{n - 2} D^n [(\exp qy)^{(n-2)/2}]_{y=0}$

$\quad = Y_n(\alpha q_1, \alpha q_2, \ldots, \alpha q_n)$

with $\alpha_i$ as in (14), by equation (5) of [8].
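
The second formula lends itself to the same kind of mechanical check as the first. The sketch below is illustrative only: it evaluates (14) at exact rational values of g instead of carrying polynomials, reads the symbolic variable as $\alpha_1 = 1$ and $\alpha_i = (n-4)(n-6)\cdots(n-2i)\,2^{1-i}$, and uses hypothetical helper names (`partial_bell`, `alpha`, `p`, `campbell_q`). Only $p_1$ to $p_4$ are computed here.

```python
from fractions import Fraction
from math import comb


def partial_bell(n, k, x):
    """Partial Bell polynomial Y_{n,k} evaluated at the numbers x[1], x[2], ..."""
    if n == 0 and k == 0:
        return Fraction(1)
    if n == 0 or k == 0:
        return Fraction(0)
    return sum(comb(n - 1, j - 1) * x[j] * partial_bell(n - j, k - 1, x)
               for j in range(1, n - k + 2))


def alpha(n, i):
    """alpha_1 = 1 and alpha_i = (n-4)(n-6)...(n-2i) / 2^(i-1) for i >= 2."""
    value = Fraction(1)
    for t in range(2, i + 1):
        value *= Fraction(n - 2 * t, 2)
    return value


def p(n, q):
    """Formula (14): -p_n = sum_i alpha_i * Y_{n,i}(q_1, ..., q_n)."""
    return -sum(alpha(n, i) * partial_bell(n, i, q) for i in range(1, n + 1))


def campbell_q(g):
    """q_1 .. q_4 of Campbell's first series at a rational point g."""
    g = Fraction(g)
    return {
        1: g,
        2: Fraction(2, 3) * (g**2 - 1),
        3: (g**3 - 7 * g) / 6,
        4: (-12 * g**4 - 28 * g**2 + 64) / 135,
    }


if __name__ == "__main__":
    q = campbell_q(2)
    for n in range(1, 5):
        print("p_%d(2) =" % n, p(n, q))
```

Since $\alpha_2 = (n - 4)/2$ vanishes for n = 4 and every later $\alpha_i$ contains the same factor, the code returns $p_4 = -q_4$ without any special case.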

BIBLIOGRAPHY

[1] Bell, E. T., "Exponential polynomials," Annals of Math., Vol. 35 (1934), pp. 258-277.

[2] Campbell, G. A., "Probability curves showing Poisson's exponential summation,"
Bell System Tech. Jl., Vol. 2 (1923), pp. 95-113; Collected Papers, New York,
1937, pp. 224-242.

[3] Cornish, E. A. and Fisher, R. A., "Moments and cumulants in the specification of
distributions," Revue de l'Institut Intern. de Stat., Vol. 4 (1937), pp. 1-14.

[4] Goldberg, H. and Levine, H., "Approximate formulas for the percentage points and
normalization of t and $\chi^2$," Annals of Math. Stat., Vol. 17 (1946), pp. 216-225.

[5] Hotelling, H. and Frankel, L. R., "The transformation of statistics to simplify their
distribution," Annals of Math. Stat., Vol. 9 (1938), pp. 87-96.

[6] Kendall, M. G., The Advanced Theory of Statistics I, London, 1943.

[7] Peiser, A. M., "Asymptotic formulas for significance levels of certain distributions,"
Annals of Math. Stat., Vol. 14 (1943), pp. 56-62.

[8] Riordan, John, "Derivatives of composite functions," Bull. Am. Math. Soc., Vol. 52
(1946), pp. 664-667.



ON THE DETERMINATION OF OPTIMUM PROBABILITIES
IN SAMPLING


By Morris H. Hansen and William N. Hurwitz
Bureau of the Census


1. Summary. In a previous paper [2] it was shown that it is sometimes
profitable to select sampling units with probability proportionate to size of the
unit. This note indicates a method of determining the probabilities of selection
which minimize the variance of the sample estimate at a fixed cost. Some approximations
that have practical applications are given.


2. Introduction. Neyman has shown that it is possible to reduce the sampling
variance of an estimate by dividing a population into sub-populations (called 
strata) and varying the proportions of units included in the sample from stratum 
to stratum [1]. His treatment presumed that the units within any stratum would 
be drawn with equal probability. In many practical sampling problems, the use 
of constant probabilities is neither necessary nor desirable. Not only is it possible 
to obtain unbiased or consistent estimates with varying probabilities of selection 
of the sampling units, but also it is possible to reduce the variance of sample 
estimates by appropriate use of this device. 

It has been shown [2] that in a subsampling system, the selection of primary 
units with probabilities proportionate to the number of elements included in the 
primary unit may bring about marked reductions in sampling variances over 
sampling with equal probabilities. In this note, we shall indicate a method of
determining the optimum probabilities under certain conditions, and also some 
approximations to the optima that have practical applications. 

By optimum probabilities, we mean the set of probabilities of selection that 
will minimize the variance for a fixed cost of obtaining sample results, or alternatively
that will minimize the cost for a fixed sampling error.


3. Optimum probability with a subsampling system. Consider, for example,
the simple subsampling system where primary units are first drawn for inclusion
in the sample and then a sample of elements is drawn from the selected primary
units. We shall suppose, for simplicity of notation, that the sampling is done without
stratification. The conclusions indicated below will be similar if stratified
sampling is used, and they will hold even if only one unit is drawn from each
stratum. Suppose that a population contains M primary units, and that the
sampling of primary units is to be done with replacement. Sampling with replacement
is assumed in order to simplify the mathematics. We wish to estimate
the ratio

$\frac{X}{Y} = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N_i} X_{ij}}{\sum_{i=1}^{M} \sum_{j=1}^{N_i} Y_{ij}}$

where $X_{ij}$ and $Y_{ij}$ are the values of two characteristics of the jth element within
the ith primary unit, and $N_i$ is the number of elements in the ith primary unit.
A consistent estimate of X/Y is given by

(1) $r = \frac{\sum_{i=1}^{m} \frac{1}{P_i} \frac{N_i}{n_i} \sum_{j=1}^{n_i} X_{ij}}{\sum_{i=1}^{m} \frac{1}{P_i} \frac{N_i}{n_i} \sum_{j=1}^{n_i} Y_{ij}}$

where

$P_i$ = The probability of selecting the ith primary unit on a single draw.

$n_i$ = The total number of elements included in the sample from the ith unit
if it is drawn. If a particular unit happens to be included in the sample
more than once the subsampling will be independently carried through
each time it is drawn.

m = The total number of primary units included in the sample.

It will be assumed that a self-weighting sample is to be used, i.e., that although
the probabilities of selecting primary units will vary, the subsampling rate
within the ith selected primary unit, $n_i/N_i$, will be such that $P_i n_i/N_i = k$. Note that,
with this condition, k is the probability that an element will be included in the
sample by making a single draw of a primary unit, and by carrying out the specified
subsampling within the selected primary unit. It follows that mkN is the
expected total number of elements included in a sample of m primary units, where

$N = \sum_{i=1}^{M} N_i$.

The method can be extended to cover situations where other conditions are imposed.

We shall express the variance of r in terms of $P_i$, m, and k, and also express
the cost in terms of these same quantities. The optimum values of $P_i$, m, and k
will then be determined.

The variance of the sample estimate. To terms of order 1/m of the Taylor expansion
of a ratio, the sampling variance of the estimate (1) is approximately

(2) [display equation and the definitions following it are not legible in the source]

The cost function. Now suppose that the total cost of the sampling procedure
involves a fixed cost attached to each primary unit included in the sample, a
cost of listing the elements within each selected primary unit (this listing may be
necessary in order to draw a subsample), and a cost of obtaining information from
each of the elements selected for inclusion in the sample. Under these circumstances
the total expected cost of the survey will be:

(3) $C = C_1 m + C_2 m \sum_{i=1}^{M} P_i N_i + C_3 m k N$

where

$C_1$ = The fixed cost per primary unit,

$C_2$ = The cost of listing one element in a selected primary unit and other
costs that vary with the number of elements to be listed,

$C_3$ = The cost of obtaining the required information from one element in
the sample,

$\sum_{i=1}^{M} P_i N_i$ = Expected number of elements in the sample per primary unit in
the sample,

mk = The over-all sampling ratio, and

$N = \sum_{i=1}^{M} N_i$ = The total number of elements in the population.

It will be noted that although the values of $P_i$ and m may be fixed in advance,
the number of elements to be listed, $\sum_{i=1}^{m} N_i$, remains a chance variable. It is for
this reason that we consider the expected cost rather than the actual cost.

The optimum values of $P_i$, m, and k. The values of $P_i$, m, and k which minimize
the variance (2) subject to the conditions that

$P_i n_i/N_i = k$, $\sum_{i=1}^{M} P_i = 1$, C is fixed,

are given by

(4) $P_i = \sqrt{\frac{N_i^2 \delta_i}{C_1 + C_2 N_i}} \Bigg/ \sum_{i=1}^{M} \sqrt{\frac{N_i^2 \delta_i}{C_1 + C_2 N_i}}$

(5) k = [expression not legible in the source]

(6) $m = \frac{C}{C_1 + C_2 \sum_{i=1}^{M} P_i N_i + C_3 k N}$

where

$\delta_i$ = [definition not legible in the source]

Ordinarily $\delta_i$ will be positive although it will often be found to be negative for
some i. For a great many populations, such negative values can be avoided by
classifying the primary units into size groups or other significant groups and then
requiring that the probability of selection be $P_a$ for every primary unit in the
a-th group.

In actual practice, however, in advance of designing a sample one does not
have the data to compute the optima and uses methods of approximating the
optimum probabilities. Methods of approximating the optimum probabilities are
given below.
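
The cost function (3) is simple enough to put into a few lines of code, which may help when comparing designs. The sketch below is an illustration rather than part of the paper: the function names and the small three-unit population in the usage lines are hypothetical, and the number of primary units m is obtained by solving (3) for m at a fixed budget C.

```python
def listing_size(P, N):
    """Expected number of elements to be listed per draw, sum of P_i * N_i."""
    return sum(p_i * n_i for p_i, n_i in zip(P, N))


def expected_cost(m, k, P, N, C1, C2, C3):
    """Total expected cost (3): C = C1*m + C2*m*sum(P_i N_i) + C3*m*k*N."""
    total_elements = sum(N)
    return C1 * m + C2 * m * listing_size(P, N) + C3 * m * k * total_elements


def primary_units_for_budget(C, k, P, N, C1, C2, C3):
    """Solve (3) for m, the number of primary units affordable at budget C."""
    return C / (C1 + C2 * listing_size(P, N) + C3 * k * sum(N))


if __name__ == "__main__":
    # Hypothetical population of three primary units, drawn with probability
    # proportionate to size.
    N = [10, 20, 70]
    P = [n / sum(N) for n in N]
    m = primary_units_for_budget(C=114.0, k=0.01, P=P, N=N, C1=5.0, C2=0.1, C3=1.0)
    print("primary units affordable:", m)
    print("cost check:", expected_cost(m, 0.01, P, N, 5.0, 0.1, 1.0))
```

Only the expected cost is computed, since the number of elements actually listed depends on which primary units happen to be drawn.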

4. Some rules for approximating the optimum probabilities. In another
paper [2] considerations were presented from which it follows that $\delta_i$ tends to
decrease with increasing size of unit, but seldom as fast as the size of unit increases.
The rate of decrease is often small relative to the increase in $N_i$, and
empirical data for a number of problems indicate that even the assumption of
$\delta_i$ being fairly constant with increasing size of unit may not lead one far astray
from the optimum probabilities. Under this assumption ($\delta_i = \delta$ for all i) the
probabilities depend only on $N_i$, $C_1$, and $C_2$, and lead to the following results:

(a) When $C_1 > 0$ and $C_2 = 0$, probability proportionate to size will be the
optimum.

(b) When $C_1 = 0$ and $C_2 > 0$, probability proportionate to the square root
of the size will be the optimum.

If we go to the other extreme (extreme not in terms of mathematically possible
values but in terms of most practical populations), and assume that $\delta_i$ decreases
at the same rate that $N_i$ increases, the results would be

(a) When $C_1 > 0$ and $C_2 = 0$, probability proportionate to the square root
of the size will be the optimum.

(b) When $C_1 = 0$ and $C_2 > 0$, equal probability will be the optimum.
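
The four limiting rules above can be reproduced numerically. The sketch below rests on an assumption that should be stated plainly: it takes the optimum probabilities to be proportional to $\sqrt{N_i^2 \delta_i/(C_1 + C_2 N_i)}$, a form chosen here because it yields exactly these four rules; the function name and the unit sizes are hypothetical.

```python
from math import sqrt


def optimum_probabilities(N, delta, C1, C2):
    """Probabilities proportional to sqrt(N_i^2 * delta_i / (C1 + C2 * N_i)).

    Assumed form of the optimum; at least one of C1, C2 must be positive.
    """
    weights = [sqrt(n * n * d / (C1 + C2 * n)) for n, d in zip(N, delta)]
    total = sum(weights)
    return [w / total for w in weights]


def normalized(values):
    total = sum(values)
    return [v / total for v in values]


if __name__ == "__main__":
    N = [4, 16, 100]                      # hypothetical unit sizes
    constant = [1.0] * len(N)             # delta_i constant
    inverse = [1.0 / n for n in N]        # delta_i decreasing as fast as N_i grows
    print("delta constant, C2 = 0:", optimum_probabilities(N, constant, 5.0, 0.0))
    print("delta constant, C1 = 0:", optimum_probabilities(N, constant, 0.0, 0.1))
    print("delta ~ 1/N,    C2 = 0:", optimum_probabilities(N, inverse, 5.0, 0.0))
    print("delta ~ 1/N,    C1 = 0:", optimum_probabilities(N, inverse, 0.0, 0.1))
```

When both $C_1$ and $C_2$ are positive the same function gives probabilities lying between the two neighbouring rules.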

The minimum is broad in the neighborhood of the optimum and the results for
either of these extremes and the values in between often will give results reasonably
close to the minimum. This leads to the following useful approximations:

(a) When $C_2 \sum P_i N_i$, the expected cost per primary unit of listing and related
operations, is small in relation to $C_1$, the fixed cost per primary unit, the
optimum probabilities will be between probability proportionate to size
and probability proportionate to the square root of size, and either of these
will be reasonably close to the optimum.

(b) When $C_1$ is small compared to $C_2 \sum P_i N_i$, the optimum probability will be
between equal probability and probability proportionate to the square root
of size, and either of these will be reasonably close to the optimum.

(c) When both $C_1$ and $C_2 \sum P_i N_i$ are of significant size, i.e., when the costs
vary substantially both with the number of primary units in the sample
and the size of the units, then probability proportionate to the square root
of the size will be a reasonably good approximation to the optimum.

(d) When units of small size are used and all of the subunits in the selected
primary units are included in the sample (that is, there is no subsampling)
equal probability is close to the optimum. It should be noted that this
rule does not follow directly from the above analysis based on subsampling,
but from a separate analysis in which no subsampling is involved.

For whatever system of probabilities is used, and with the cost function given 
by (3), the optimum value of k is given by: 



which can be approximated, in application, from prior experience or preliminary
studies. The corresponding optimum value for m is obtained by substitution in
the cost function.

The above results should not be accepted, of course, as the optima for every 
cost function or every sampling system. Either past experimental data may be 
available or pilot tests made to determine the cost function and the appropriate 
approximations that should be used in various practical situations. 

An illustration. An illustration may be of interest. A characteristic published
for city blocks in the 1940 Census of Housing is the number of dwelling
units that are in need of major repairs or that lack a private bath. Suppose we
were sampling to estimate the proportion of the dwelling units having this characteristic
for the Bronx in New York City, at the time of the 1940 Census. Let
us assume that once we selected a system of probabilities we used the optimum
numbers of blocks and the optimum sampling ratios appropriate to these probabilities,
that is, the optimum values of k and m. For each of several cost functions
the following Table 1 shows the sampling variances of each system, relative
to the variance of sampling with equal probability. It also shows values of
$C_2 \sum P_i N_i$ for comparison with $C_1$.

TABLE 1

                       Average cost per primary unit of      Variances relative to equal
     Unit costs        listing and related operations                probability
                               (C_2 ΣP_iN_i)
  C_1    C_2    C_3    Equal     Prop. to      Prop. to      Equal     Prop. to      Prop. to
                       prob.     sq. root      size          prob.     sq. root      size
                                 of size                               of size

   5     .10     1     13.49      21.15        27.63          100        92           104
   5     .05     1      6.75       ...         13.82          100        88            97
   5     .02     1      ...        4.23         5.53          100        83            87
   5     0       1      ...        ...          ...           100        75            73

   2     .10     1     13.49      21.15        27.63          100        96           111
   2     .05     1      6.75       ...         13.82          100        93           106
   2     .02     1      ...        4.23         5.53          100        ...           97
   2     0       1      ...        ...          ...           100        79            77

   1     .10     1     13.49      21.15        27.63          100        97           114
   1     .05     1      6.75       ...         13.82          100        96           110
   1     .02     1      ...        4.23         5.53          100        93           103
   1     0       1      ...        ...          ...           100        82            81

  ...    .10     1     13.49      21.15        27.63          100        99           117
  ...    .05     1      6.75       ...         13.82          100        99           115
  ...    .02     1      ...        4.23         5.53          100        99           113

(An entry shown as ... is not legible in the source.)

Some of the costs given in the table do not have unreasonable relationships
in terms of the situations encountered in practice in various types of jobs. The
comparisons are not affected by the absolute magnitudes of the costs but only
by their relative magnitudes. The results are consistent with the rough rules of
thumb given above. It is worth noting that in each of the above instances probability
proportionate to the square root of the size yields a comparatively low
variance.


5. Sampling with or without replacement. In this paper the sampling with
varying probabilities was assumed to be carried out with replacement, which
ordinarily would not be advisable in practice. When sampling is done without
replacement the optimum probabilities and their approximations will be about
the same as for sampling with replacement in at least those instances where the
proportion of the population in the sample is small. Further investigation is
needed for large sampling rates.

6. Conclusion. In summary, it is not essential and may not be desirable to
give each element in the population (or stratum) the same chance of being drawn
in order to avoid bias or to have a consistent estimate. Estimate (1) is a consistent
estimate no matter what probabilities of selection are assigned to these
units. The use of variable probabilities of selection is another device to be added
to those already in the literature, such as stratification and efficient methods of
estimation, which make it possible to achieve the objectives of a sample survey
at reduced costs. Reference [2] gives another illustration of reductions in sampling
variance achieved through the use of varying probabilities in accordance with
the rules suggested above for approximating the optimum probabilities.


[1] Jerzy Neyman, "On the two different aspects of the representative method of purposive
selection," Roy. Stat. Soc. Jour., New Series, Vol. 97 (1934), pp. 558-606.

[2] Morris H. Hansen and William N. Hurwitz, "On the theory of sampling from finite
populations," Annals of Math. Stat., Vol. 14 (1943), pp. 333-362.



A SOLUTION TO THE PROBLEM OF OPTIMUM CLASSIFICATION 

By P. G. Hoel and R. P. Peterson
University of California, Los Angeles 

1. Summary. By means of a general theorem, the space of the variables of 
classification is separated into population regions such that the probability 
of a correct classification is maximized. The theorem holds for any number of 
populations and variables but requires a knowledge of population parameters 
and probabilities. A second theorem yields a large sample criterion for deter¬ 
mining an optimum set of estimates for the unknown parameters. The two 
theorems combine to yield a large sample solution to the problem of how best to 
discriminate between two or more populations. 

2. Introduction. There are essentially two basic problems in discriminant 
analysis. The first problem is to test whether the populations differ, since it 
would be futile to attempt a classification if the populations did not differ. The 
second problem is to find an efficient method for classifying individuals into their
proper populations. In this paper, an optimum asymptotic solution of the 
second problem will be presented. 

3. Parameters known. Let $f_i = f_i(x_1, \ldots, x_k)$, $(i = 1, \ldots, r)$, denote the
probability density function of population i in the region under consideration.
Let $p_i > 0$, $(i = 1, \ldots, r)$, denote the probability that population i will be
sampled if a single individual is selected at random from that region, and let R
denote the k dimensional Euclidean variable space. Then the desired theorem
is the following:

Theorem 1. If $M_i$ denotes the region in R where $p_i f_i \ge p_j f_j$, $(j = 1, \ldots, r)$,
and where $p_i f_i > 0$, then the set of regions $M_i$, $(i = 1, \ldots, r)$, in which any
overlap is assigned to the $M_i$ with the smallest index, will maximize the probability
of a correct classification.

For the purpose of proving this theorem, consider any other set of non-overlapping
regions, $M'_i$. Since the addition to any of the regions $M'_i$ of a part
of R throughout which all the functions $f_i$ vanish will not affect the probability
of a correct classification, there is no loss of generality in assuming that the set of
regions $M'_i$ contains the same portion of R as the set of regions $M_i$ does. The relationship
between the two sets may be expressed by means of the formulas

(1) $M_i = \sum_{j=1}^{r} M_{ij}$

and

(2) $M'_j = \sum_{i=1}^{r} M_{ij}$,

where $M_{ij}$ denotes that part of $M_i$ which is contained in $M'_j$.

Since a sample point that falls in the region $M_i$ will be judged to have come
from population i, the probability of the correct classification of a single random
sample by means of the set $M_i$ is given by

(3) $Q = p_1 \int_{M_1} f_1\, dE + \cdots + p_r \int_{M_r} f_r\, dE$,

where $dE = dx_1 dx_2 \cdots dx_k$. If Q' denotes the probability of the correct classification
by means of the set $M'_i$,

$Q' = p_1 \int_{M'_1} f_1\, dE + \cdots + p_r \int_{M'_r} f_r\, dE$.

In the notation of (1) and (2), these probabilities become

$Q = p_1 \int_{\sum_j M_{1j}} f_1\, dE + \cdots + p_r \int_{\sum_j M_{rj}} f_r\, dE$

and

$Q' = p_1 \int_{\sum_i M_{i1}} f_1\, dE + \cdots + p_r \int_{\sum_i M_{ir}} f_r\, dE$.

Now consider the difference Q - Q'. It can be expressed in the form

$Q - Q' = \sum_{i=1}^{r} \sum_{j=1}^{r} \left[ p_i \int_{M_{ij}} f_i\, dE - p_j \int_{M_{ij}} f_j\, dE \right]$

$\quad = \sum_{i=1}^{r} \sum_{j=1}^{r} \int_{M_{ij}} [p_i f_i - p_j f_j]\, dE$.

Since $M_{ij}$ is contained in $M_i$ and $p_i f_i \ge p_j f_j$, $(j = 1, \ldots, r)$, holds throughout
$M_i$, it follows that each of these integrals is non-negative, consequently $Q \ge Q'$,
which proves the theorem.
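
The rule of Theorem 1 is easy to state in code. The sketch below is an illustration under assumed inputs, not the authors' procedure: it uses two univariate normal populations with unit variance and equal prior probabilities (all hypothetical), assigns a point to the population with the largest $p_i f_i$ with ties going to the smallest index, and compares the probability of a correct classification at the resulting boundary with that of other cut-off points.

```python
from math import erf, exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2.0 * pi))


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def classify(x, priors, densities):
    """Index i maximizing p_i * f_i(x); ties go to the smallest index.

    Returns None where every p_i * f_i(x) vanishes (outside all regions M_i).
    """
    scores = [p * f(x) for p, f in zip(priors, densities)]
    best = max(scores)
    if best <= 0.0:
        return None
    return scores.index(best)   # list.index returns the first (smallest) index


def prob_correct(cutoff, priors, means, sigma=1.0):
    """Q for the rule 'population 0 below the cutoff, population 1 above it'."""
    return (priors[0] * normal_cdf(cutoff, means[0], sigma)
            + priors[1] * (1.0 - normal_cdf(cutoff, means[1], sigma)))


if __name__ == "__main__":
    priors = [0.5, 0.5]
    means = (0.0, 2.0)
    densities = [lambda x: normal_pdf(x, means[0], 1.0),
                 lambda x: normal_pdf(x, means[1], 1.0)]
    for x in (0.5, 1.0, 1.5):
        print(x, "->", classify(x, priors, densities))
    for cutoff in (0.0, 0.5, 1.0, 1.5, 2.0):
        print("cutoff", cutoff, "Q =", round(prob_correct(cutoff, priors, means), 4))
```

In this example the regions of Theorem 1 meet at the midpoint of the two means, and no other cut-off gives a larger probability of correct classification.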

This theorem yields a solution to the classification problem only when the $f_i$
are completely specified and the $p_i$ are known.

It will be observed that this theorem is similar to a generalization of a funda¬ 
mental lemma in the Neyman-Pearson theory of testing hypotheses [1], and to a 
result by Welch [2]. 

If the basic weight function in Wald’s [3] formulation of the multiple decision 
problem assumes only the values 0 and 1, corresponding to whether or not a 
correct classification is made, it will be found that the set of regions Mi will 
minimize the expected value of the loss in that formulation. 

4. Parameters unknown. Since the $p_i$, as well as the parameters in the $f_i$,
are assumed to be unknown, Q will be a function of such parameters. Let $\theta_1, \ldots, \theta_s$
denote all such parameters, including the $p_i$. Now let a random sample of size n
be taken from the region under consideration and let $\hat\theta_1, \ldots, \hat\theta_s$ denote a set of
estimates of the parameters based on this sample. Since the total sample will
constitute a sample of size $n_1$ from $f_1$, $n_2$ from $f_2$, etc., where $n = n_1 + \cdots + n_r$,
the $\theta$'s for $f_i$ will be estimated by means of a sample of size $n_i$ rather than of size n.
In the following arguments, it will not be necessary to distinguish between $\theta$'s
which are estimated by different size samples because the arguments will be
based on the order of terms with respect to the size sample and $n_i \sim np_i$ with
probability one. Or, more simply, choose all $n_i$ equal.

Let $\hat M_i$ correspond to $M_i$ when the parameters are replaced by their sample
estimates and let $\hat Q$ denote the probability of a correct classification when using
the regions $\hat M_i$ in place of the regions $M_i$. Then, from (3),

$Q - \hat Q = \sum_{i=1}^{r} p_i \left[ \int_{M_i} f_i\, dE - \int_{\hat M_i} f_i\, dE \right]$.

Let $H = Q - \hat Q$. Since the estimates, $\hat\theta_i$, are random variables, H will be a
random variable which is a function of the estimation functions, $\hat\theta_i$, as well as
of the parameters, $\theta_i$. The desired criterion for determining optimum estimates
is then given by the following theorem:

Theorem 2. If $E(\hat\theta_i - \theta_i)^4 = O(n^{-g})$, $g > 0$, and if in some neighborhood of the
point $\hat\theta_i = \theta_i$, $(i = 1, \ldots, s)$, the function H is continuous and possesses continuous
derivatives of the first, second, and third order with respect to the $\hat\theta_i$, then

$E(H) = \frac{1}{2} \sum_{i=1}^{s} \sum_{j=1}^{s} H_{ij}\, E(\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j) + O(n^{-3g/4})$,

where $H_{ij}$ denotes the partial derivative of H with respect to $\hat\theta_i$ and $\hat\theta_j$ at the point
$(\theta_1, \ldots, \theta_s)$.

The proof is similar to the type of proof used by Cramér [4] to obtain an
expression for the variance of a function of central moments.

By means of Tchebycheff's inequality [4], page 182, it follows that

$P[(\hat\theta_i - \theta_i)^4 > \epsilon^4] \le E(\hat\theta_i - \theta_i)^4 / \epsilon^4$.

From the theorem assumptions, there exists a constant A such that

$P[(\hat\theta_i - \theta_i)^4 > \epsilon^4] < \frac{A}{\epsilon^4 n^g}$.

This is equivalent to

$P[|\hat\theta_i - \theta_i| > \epsilon] < \frac{A}{\epsilon^4 n^g}$.

If $E_1$ denotes the set of points in sample space where $|\hat\theta_i - \theta_i| \le \epsilon$, $(i = 1, \ldots, s)$,
and $E_2$ denotes the complementary set, this inequality implies that

(4) $P[E_2] < \frac{sA}{\epsilon^4 n^g}$.



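The expected value of H may be written in the form

(5) $E(H) = \int_{E_1} H\, dP + \int_{E_2} H\, dP$.

Consider the order of the second integral. From (4) and the fact that H is the
difference of two probabilities, it follows that

$\left| \int_{E_2} H\, dP \right| \le \int_{E_2} dP = P[E_2] < \frac{sA}{\epsilon^4 n^g}$.

Consequently (5) becomes

(6) $E(H) = \int_{E_1} H\, dP + O(n^{-g})$.

Now consider the first integral. From the theorem assumptions, if $\epsilon$ is chosen
sufficiently small, it follows that for any point in the set $E_1$, the function H
can be expanded in the form

$H = H(\theta) + \sum_{i=1}^{s} (\hat\theta_i - \theta_i) H_i(\theta) + \frac{1}{2} \sum_{i=1}^{s} \sum_{j=1}^{s} (\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j) H_{ij}(\theta) + R$,

where $\theta$ denotes the point $(\theta_1, \ldots, \theta_s)$, where

$R = \frac{1}{6} \sum_{i=1}^{s} \sum_{j=1}^{s} \sum_{k=1}^{s} (\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j)(\hat\theta_k - \theta_k) H_{ijk}(\theta')$,

and where $\theta'$ is some point in $E_1$. Since $\hat Q$ reduces to Q when $\hat\theta = \theta$, $H(\theta) = 0$.
Furthermore, since Q denotes the maximum probability of a correct classification,
$H \ge 0$ for all $\hat\theta$; hence $H_i(\theta) = 0$ and $H_{ii}(\theta) \ge 0$ for all i. Thus, for any point
in the set $E_1$,

$H = \frac{1}{2} \sum_{i=1}^{s} \sum_{j=1}^{s} (\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j) H_{ij}(\theta) + R$.

If this expression is substituted in (6), it will become

(7) $E(H) = \frac{1}{2} \sum_{i=1}^{s} \sum_{j=1}^{s} H_{ij}(\theta) \int_{E_1} (\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j)\, dP + \int_{E_1} R\, dP + O(n^{-g})$.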

* 1 3 Jii Jsi 

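Consider, first, the order of the remainder term. From the continuity assumption
on $H_{ijk}$, it follows that $H_{ijk}$ is bounded in $E_1$, say $|H_{ijk}(\theta')| < B$; hence

$\left| \int_{E_1} (\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j)(\hat\theta_k - \theta_k) H_{ijk}(\theta')\, dP \right| < B \int_{E_1} |(\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j)(\hat\theta_k - \theta_k)|\, dP$.

By Schwarz's inequality,

$\int_{E_1} |(\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j)(\hat\theta_k - \theta_k)|\, dP \le \left[ \int_{E_1} (\hat\theta_i - \theta_i)^2 (\hat\theta_j - \theta_j)^2\, dP \int_{E_1} (\hat\theta_k - \theta_k)^2\, dP \right]^{1/2}$.

Similarly,

$\int_{E_1} (\hat\theta_i - \theta_i)^2 (\hat\theta_j - \theta_j)^2\, dP \le \left[ \int_{E_1} (\hat\theta_i - \theta_i)^4\, dP \int_{E_1} (\hat\theta_j - \theta_j)^4\, dP \right]^{1/2}$

and

$\int_{E_1} (\hat\theta_k - \theta_k)^2\, dP \le \left[ \int_{E_1} (\hat\theta_k - \theta_k)^4\, dP \right]^{1/2}$.

Since

$\int_{E_1} (\hat\theta_i - \theta_i)^4\, dP \le \int_{E_1 + E_2} (\hat\theta_i - \theta_i)^4\, dP = O(n^{-g})$,

the preceding inequalities combine to give

(8) $\left| \int_{E_1} R\, dP \right| = O(n^{-3g/4})$.

Now consider the first integral in (7). It may be written in the form

(9) $\int_{E_1} (\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j)\, dP = E(\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j) - \int_{E_2} (\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j)\, dP$.

By Schwarz's inequality,

$\left| \int_{E_2} (\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j)\, dP \right| \le \left[ \int_{E_2} (\hat\theta_i - \theta_i)^2\, dP \int_{E_2} (\hat\theta_j - \theta_j)^2\, dP \right]^{1/2}$.

Similarly,

$\int_{E_2} (\hat\theta_i - \theta_i)^2\, dP \le \left[ \int_{E_2} (\hat\theta_i - \theta_i)^4\, dP \cdot P[E_2] \right]^{1/2}$.

If these inequalities are combined and inequality (4) is employed, (9) will
reduce to

(10) $\int_{E_1} (\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j)\, dP = E(\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j) + O(n^{-g})$.

Finally, if (8) and (10) are employed in (7), it will reduce to the result stated
in the theorem.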

The order of the leading term in E(H) depends upon the nature of the estimating
functions, $\hat\theta_i$. In order to insure that this term will be the dominating
term, and thus rule out pathological situations, only that class of estimating
functions (estimators) will be considered for which this term will be of lower
order than that of the remainder term. If the estimators are means or central
moments, for example, then g = 2. For such estimators the order of the remainder
term is $O(n^{-3/2})$, whereas the order of the leading term is not higher than $O(n^{-1})$.

A set of estimators will be called an optimum set if it maximizes the expected
value of the probability of a correct classification, or, what is equivalent, if it
minimizes E(H). Since only large samples are being considered here, it is necessary
to define optimum in an asymptotic sense. Consider sets of estimators for
which E(H) is of order $n^{-\alpha}$. For this class of estimators, a set will be called
asymptotically optimum if it minimizes

$\lim_{n \to \infty} n^{\alpha} E(H)$.

Among asymptotically optimum sets of various orders, the set corresponding
to the highest order would naturally be considered as the best asymptotic set.
Now from Theorem 2, it readily follows that a set of estimators which minimizes

(11) $\sum_{i=1}^{s} \sum_{j=1}^{s} H_{ij}\, E(\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j)$

will be an asymptotically optimum set.

5. Maximum likelihood estimates. If the estimates $\hat\theta_i$ are unbiased and uncorrelated,
(11) will reduce to

(12) $\sum_{i=1}^{s} H_{ii}\, \sigma_i^2$

where $\sigma_i^2 = E(\hat\theta_i - \theta_i)^2$ is a function of n as well as of the parameters. Since, from
the discussion preceding (7), $H_{ii} \ge 0$, it follows that (12) will be a minimum when
the $\sigma_i^2$ assume their minimum values. Now it is known, [4], page 504, that under
mild restrictions maximum likelihood estimates possess minimum asymptotic
variances; hence for estimators of the type being considered which also satisfy the
conditions in [4], the maximum likelihood estimates of the $\theta_i$ will yield an
asymptotically optimum set of estimates for the classification problem.

REFERENCES

[1] J. Neyman and E. S. Pearson, "On the problem of the most efficient tests of statistical
hypotheses," Roy. Soc. Phil. Trans., Vol. 231 (1933), pp. 289-337.

[2] B. L. Welch, "Note on discriminant functions," Biometrika, Vol. 31 (1939), pp. 218-220.

[3] A. Wald, "Contributions to the theory of statistical estimation and testing hypotheses,"
Annals of Math. Stat., Vol. 10 (1939), pp. 209-304.

[4] H. Cramér, Mathematical Methods of Statistics, Princeton University Press, 1946,
pp. 352-356.



NOTES 

This section is devoted to brief research and expository articles on methodology and
other short items.

A GENERALIZATION OF WALD’S FUNDAMENTAL IDENTITY 

By Gunnar Blom 
University of Stockholm 

1. Summary. The fundamental identity is generalized to the case of independent
random variables with non-identical distributions. The conditions for the
validity of the differentiation of the identity are discussed. The results given in
[1], [2], and [3] are obtained as special cases.

2. A property of cumulative sums. Let $z_1, z_2, \cdots$ be an infinite sequence of
independent random variables, $F_1(z), F_2(z), \cdots$ their distribution functions (d.f.)
and $\varphi_1(t), \varphi_2(t), \cdots$ their moment-generating functions so that $\varphi_\nu(t) = E(e^{tz_\nu})$.
$a_N$ and $b_N$ are given constants ($a_N > b_N$, $N = 1, 2, \cdots$). $n$ is defined as the
smallest integer $N$ for which $Z_N = z_1 + \cdots + z_N$ is $\ge a_N$ or $\le b_N$.

We first give two lemmas. 

Lemma 1. If two positive quantities $\delta$ and $\epsilon$ can be found such that at least one
of the following conditions a) and b) is satisfied

a) $P(z_\nu > \delta) > \epsilon$ for all $\nu$ and $\limsup_{N\to\infty} a_N < \infty$,

b) $P(z_\nu < -\delta) > \epsilon$ for all $\nu$ and $\liminf_{N\to\infty} b_N > -\infty$,

then for any $k \ge 0$

(1) $\lim_{N\to\infty} N^k P(n > N) = 0$.

An inspection of the proof of (4) in [4] shows that this formula holds when the 
conditions of the lemma are satisfied. The lemma follows. 

Lemma 1 can be generalized as follows. 

Lemma 2. If two positive quantities $\delta$ and $\epsilon$ and a sequence $c_1, c_2, \cdots$ can be
found such that at least one of the following conditions a) and b) is satisfied

a) $P(z_\nu + c_\nu > \delta) > \epsilon$ for all $\nu$, $\limsup_{N\to\infty} a_N < \infty$, $\limsup_{N\to\infty} \sum_{1}^{N} c_\nu < \infty$,

b) $P(z_\nu + c_\nu < -\delta) > \epsilon$ for all $\nu$, $\liminf_{N\to\infty} b_N > -\infty$, $\liminf_{N\to\infty} \sum_{1}^{N} c_\nu > -\infty$,

then (1) is true.




Proof: In case a) we put $z'_\nu = z_\nu + c_\nu$, $Z'_N = \sum_{1}^{N} z'_\nu$ and $a'_N = a_N + \sum_{1}^{N} c_\nu$.
The inequality $Z_N \ge a_N$ then becomes $Z'_N \ge a'_N$. As $P(z'_\nu > \delta) > \epsilon$ and
$\limsup a'_N < \infty$, Lemma 1 can be applied to the sequence $z'_1, z'_2, \cdots$, and thus (1) is
true. When conditions b) are satisfied, the proof is analogous.

3. The generalized fundamental identity. In this section we shall consider 
sequences of random variables of the type defined in Lemma 2. We shall prove 
two theorems the first of which is valid for complex values of t and the second 
only for real values of t. 

Theorem 1. Assuming that

1°. at least one of conditions a) and b) of Lemma 2 is satisfied;

2°. $b \le b_N < a_N \le a$, where a and b are finite,

3°. for some complex (or real) value of t, $\varphi_\nu(t)$ exists for all $\nu$ and is $\ne 0$ and
$\liminf_{N\to\infty} |\varphi_1(t)\cdots\varphi_N(t)| > 0$,

then

(2) $E[e^{tZ_n}(\varphi_1(t)\cdots\varphi_n(t))^{-1}] = 1$.

Proof. Let $W_m$ denote the set of all sequences $z_1, \cdots, z_N$ in the N-dimensional
Euclidean space $\Omega_N$ for which $n = m$ ($m \le N$), $W'_m$ the projection of $W_m$ on $\Omega_m$
and $W_{n>N}$ all sequences for which $n > N$. We have identically

$\left[\sum_{m=1}^{N}\int_{W_m} + \int_{W_{n>N}}\right] e^{tZ_N}\,dF_1\cdots dF_N = \int_{\Omega_N} e^{tZ_N}\,dF_1\cdots dF_N = \varphi_1(t)\cdots\varphi_N(t)$.

Dividing by the right member and cancelling common factors we obtain

(3) $\sum_{m=1}^{N}(\varphi_1\cdots\varphi_m)^{-1}\int_{W'_m} e^{tZ_m}\,dF_1\cdots dF_m + (\varphi_1\cdots\varphi_N)^{-1}\int_{W_{n>N}} e^{tZ_N}\,dF_1\cdots dF_N = 1$.

When $N \to \infty$ the first sum tends to the left member of (2). We thus have to
investigate the last term in (3) which we denote by $R_N$. We can write

(4) $R_N = (\varphi_1\cdots\varphi_N)^{-1}\int_{W_{n>N}} e^{tZ_N}\,dF_1\cdots dF_N = (\varphi_1\cdots\varphi_N)^{-1}P(n > N)\,E_{n>N}\,e^{tZ_N}$.

It follows from Lemma 2 that $P(n > N) \to 0$. As $b < Z_N < a$ by 2° we conclude
that $R_N \to 0$. This proves the theorem.
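
Relation (3) is an exact identity for every N, so it can be checked numerically. The following Python sketch is an added illustration, not part of the original note. It assumes steps equal to +1 with probability $p_\nu$ and -1 otherwise (independent but not identically distributed) and constant integer limits a and b; the function name `wald_identity_terms` and the choice $p_\nu = 0.5 + 0.2/\nu$ are hypothetical.

```python
import math

def wald_identity_terms(p, t, a, b, n_steps):
    """Return (partial_sum, remainder) of relation (3) after n_steps.

    Step nu is +1 with probability p(nu) and -1 otherwise. Sampling stops
    as soon as the cumulative sum is >= a or <= b (integers with b < 0 < a).
    """
    alive = {0: 1.0}   # P(Z_N = k and n > N) for positions k strictly inside (b, a)
    prod_phi = 1.0     # phi_1(t) * ... * phi_N(t)
    partial = 0.0      # first sum in (3): contributions of paths already stopped
    for nu in range(1, n_steps + 1):
        p_nu = p(nu)
        prod_phi *= p_nu * math.exp(t) + (1.0 - p_nu) * math.exp(-t)
        nxt = {}
        for k, mass in alive.items():
            for step, prob in ((1, p_nu), (-1, 1.0 - p_nu)):
                pos = k + step
                if pos >= a or pos <= b:
                    # path stops at n = nu with Z_n = pos
                    partial += mass * prob * math.exp(t * pos) / prod_phi
                else:
                    nxt[pos] = nxt.get(pos, 0.0) + mass * prob
        alive = nxt
    # R_N: paths still running after n_steps
    remainder = sum(m * math.exp(t * k) for k, m in alive.items()) / prod_phi
    return partial, remainder

if __name__ == "__main__":
    for n_steps in (5, 20, 200):
        s, r = wald_identity_terms(lambda nu: 0.5 + 0.2 / nu, 0.4, 3, -3, n_steps)
        print(n_steps, s + r, r)
```

In this sketch the two returned terms add up to 1 (up to rounding) for every number of steps, while the remainder shrinks as the number of steps grows, which is the mechanism used in the proof.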

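Theorem 2. If, for some real value of t, $\varphi_\nu(t)$ exists for all $\nu$ and if quantities
$c_\nu$, $\epsilon > 0$ and $\delta > 0$ can be found such that at least one of the following conditions
a) and b) is satisfied for all $\nu$

a) $\limsup_{N\to\infty} a_N < \infty$, $\limsup_{N\to\infty} \sum_{1}^{N} c_\nu < \infty$ and

(5a) $A_\nu(t, \delta) = \frac{1}{\varphi_\nu(t)}\int_{\delta - c_\nu}^{\infty} e^{tz}\,dF_\nu(z) > \epsilon$, $(\nu = 1, 2, \cdots)$,

b) $\liminf_{N\to\infty} b_N > -\infty$, $\liminf_{N\to\infty} \sum_{1}^{N} c_\nu > -\infty$ and

(5b) $\frac{1}{\varphi_\nu(t)}\int_{-\infty}^{-\delta - c_\nu} e^{tz}\,dF_\nu(z) > \epsilon$, $(\nu = 1, 2, \cdots)$,

then (2) holds.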

The conditions of the theorem become more attractive if the theorem is 
limited to the somewhat less general cases mentioned in the Corollary below. 
The above formulation has been chosen mainly because of an important applica¬ 
tion to identical variables in Sec. 6. 

Proof. The theorem is proved if we can show that $R_N$ in (4) tends to zero when
$N \to \infty$. For that purpose we use the transformation (cf. [5] and [3])

(6) $G_\nu(z; t) = \frac{1}{\varphi_\nu(t)}\int_{-\infty}^{z} e^{tx}\,dF_\nu(x)$, $(\nu = 1, 2, \cdots)$.

$G_\nu(z; t)$ is obviously a d.f. for every real t (for which $\varphi_\nu(t)$ exists). When (5a)
holds,

$P[z_\nu + c_\nu > \delta \mid G_\nu(z; t)] = A_\nu(t, \delta)$.

Here the expression in the left member denotes the probability that $z_\nu + c_\nu > \delta$,
when $G_\nu$ is the d.f. of $z_\nu$.

Consequently, when conditions a) are fulfilled, a sequence of random variables
with the d.f.'s $G_1(z; t), G_2(z; t), \cdots$ or, with one notation, $G(t)$ satisfies the
conditions a) of Lemma 2. It follows that

$\lim_{N\to\infty} P(n > N \mid G(t)) = 0$.

Introducing $G_\nu(z; t)$ in $R_N$ we find

$R_N = \int_{W_{n>N}} dG_1\cdots dG_N = P(n > N \mid G(t))$.

Consequently $R_N \to 0$. When conditions b) are fulfilled, the proof is analogous.
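
The transformation (6) replaces each d.f. by an exponentially tilted one. The Python sketch below is an added illustration for a discrete distribution given as a dictionary `{value: probability}`. The helper names `tilt` and `tail_prob` are hypothetical, and $c_\nu = 0$ is assumed, so that the quantity in (5a) is simply the upper tail probability under the tilted distribution.

```python
import math

def tilt(pmf, t):
    """Return (phi, tilted_pmf), where phi = E(e^{tz}) and the tilted pmf
    gives each value z the weight prob * e^{tz} / phi, as in (6)."""
    phi = sum(prob * math.exp(t * z) for z, prob in pmf.items())
    tilted = {z: prob * math.exp(t * z) / phi for z, prob in pmf.items()}
    return phi, tilted

def tail_prob(pmf, threshold):
    """P(z > threshold) under the given pmf."""
    return sum(prob for z, prob in pmf.items() if z > threshold)

if __name__ == "__main__":
    phi, tilted = tilt({-1: 0.5, 1: 0.5}, math.log(2.0))
    print(phi, tilted, tail_prob(tilted, 0.5))
```

For the symmetric two-point distribution on -1 and +1 with $t = \log 2$, the weights $0.5 \cdot 2$ and $0.5 \cdot 0.5$ are divided by $\varphi = 1.25$, so the tilted distribution is again a probability distribution and puts mass 0.8 on +1.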

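Corollary to Theorem 2. If 1° $\varphi_\nu(t)e^{tc_\nu} \le H(t) < \infty$, 2° t is positive and
conditions a) of Lemma 2 hold or t is negative and conditions b) of Lemma 2 hold,
then the generalized fundamental identity is true.

For, in the first case

$\frac{1}{\varphi_\nu(t)}\int_{\delta - c_\nu}^{\infty} e^{tz}\,dF_\nu(z) \ge \frac{e^{t(\delta - c_\nu)}}{\varphi_\nu(t)}P(z_\nu + c_\nu > \delta) > \frac{\epsilon e^{t\delta}}{H(t)}$,

so that (5a) is satisfied, and similarly when t is negative.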

The following special case deserves particular attention as it covers most 
cases occurring in practice and the conditions become very simple: If a sequence 
of random variables satisfies conditions a) and b) of Lemma 1 simultaneously, a 
sufficient condition for the validity of (2) for some given real value of t is that the 
sequence $\varphi_\nu(t)$ is bounded.





4. Application to Poisson variables. As an application of (2) we consider a
sequence of Poisson variables with the parameters $\lambda m_\nu$, where $\lambda$ is a positive
quantity and $m_\nu$ are positive integers. From the well-known formula

$\varphi_\nu(t) = e^{\lambda m_\nu(e^t - 1)}$

we easily conclude that the conditions of Theorem 1 are valid if $\Re(e^t) \ge 1$. (With
$\delta < 1$ in (5a) we find that (2) holds even for negative t.) If, in particular, we
choose t so that $e^t = 1 + 2k\pi i/\lambda = c_k$, we have the simple formula

$E(c_k^{Z_n}) = 1$, $(k = 1, 2, \cdots)$.
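
The special choice of t can be checked numerically. The Python sketch below is an added illustration. It assumes $c_k = 1 + 2k\pi i/\lambda$, for which every $\varphi_\nu$ equals 1 because $m_\nu$ is an integer, and it compares the closed form of $E(c^z)$ for a Poisson variable with a truncated series. The function names are hypothetical.

```python
import cmath
import math

def poisson_pgf_closed(c, mu):
    """E(c**z) for z ~ Poisson(mu), closed form exp(mu * (c - 1))."""
    return cmath.exp(mu * (c - 1))

def poisson_pgf_series(c, mu, terms=200):
    """The same expectation summed term by term: sum_j e^{-mu} (mu*c)^j / j!."""
    total = 0j
    term = complex(math.exp(-mu))   # j = 0 term
    for j in range(terms):
        total += term
        term *= mu * c / (j + 1)    # ratio of consecutive terms
    return total

if __name__ == "__main__":
    lam, m, k = 2.0, 3, 1
    c_k = 1 + 2j * math.pi * k / lam
    print(poisson_pgf_closed(c_k, lam * m), poisson_pgf_series(c_k, lam * m))
```

With these assumed values both evaluations agree with 1 up to rounding, so the denominator in (2) drops out and only $E(c_k^{Z_n})$ remains.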


5. Differentiation of the generalized fundamental identity. In this section t
is assumed to be real. We denote the kth derivative of $\varphi_\nu(t)$ by $\varphi_\nu^{(k)}(t)$. We shall
prove the following theorem which corresponds to Theorems 1 and 2.

Theorem 3. If for all t in a closed interval I the conditions stated in Theorems
1 or 2 are satisfied and if, in addition, the functions $\varphi_\nu^{(k)}(t)/\varphi_\nu(t)$ are uniformly bounded
with respect to both $\nu$ and t (in I) for $k = 1, 2, \cdots, r$, then the generalized fundamental
identity may be differentiated r times with respect to t for any t in the interior
of I.

We use a method of proof which is similar to that used in [2]. We first show
that the sum in (3) may be differentiated r times under the integral signs and
secondly that the rth derivative of $R_N$ tends to zero uniformly in t when $N \to \infty$.

The rth derivative of the general term of the series in (3) consists of a finite 
number of terms of the form 


Jmii) = {ipi“' ipm) f , dFi • • • dFm ill g X; /i, X = 1, 2, • ■ • r). 


and the rth derivative of Rff in (4) consists of a finite number (which does not 
depend on N) of similar expressions with N substituted for m and 'Prn>jr for 
Wm. H„ is & sum of mf and W'' terms respectively which is symmetric in v. 

{k ^K]v = 1,2, • m) and are thus major- 




The terms are functions of 

ip,{i) 

ated by the same constant C. 

Further, we can always find a positive quantity U such that for all t in I 






Hence 


(7) 1 J„(0 I ^ (y,i ■ •. Cm'' f , (e'”^-" + e-'"*-") dPi ■ ■ • dF„ 

The rest of the proof is divided into two parts corresponding to the conditions 
of Theorem 1 and those of Theorem 2. 





When the conditions of Theorem 2 are fulfilled we make the transformation
(6) in (7) with $t = t_0$ and $t = -t_0$. Then

$|J_m(t)| \le Cm^{\mu}[P(n = m \mid G(t_0)) + P(n = m \mid G(-t_0))] \le 2Cm^{\mu} < \infty$.

This justifies the differentiation of the series in (3).

Substituting N for m and n > N for n = m in the above expression we further
have

$|J_N(t)| \le CN^{\mu}[P(n > N \mid G(t_0)) + P(n > N \mid G(-t_0))]$,

and conclude from Lemma 2 with $k = \mu$ in (1) that $J_N(t)$ tends to zero uniformly
in t. It follows that the rth derivative of $R_N$ also tends to zero uniformly in t.

In the second part of the proof we assume the conditions of Theorem 1 to be
satisfied. We then write (7) in the following form

(8) $|J_m(t)| \le (\varphi_1\cdots\varphi_m)^{-1}Cm^{\mu}P(n = m)\,E_{n=m}(e^{t_0Z_m} + e^{-t_0Z_m})$,

where $E_{n=m}$ signifies the conditional expectation when it is known that $n = m$.
From the definition of n it follows that, when $n = m$, we have $b_{m-1} < Z_{m-1} < a_{m-1}$
and $Z_m \ge a_m$ or $\le b_m$. Hence

| ^ a„) = i Z„_i + s,. ^ a„] 

£ 1 > o„ - 

The second exponential can be treated in a similar way. Thus $J_m(t)$ is majorated
by a finite expression.

Finally, we substitute N for m and n > N for n = m in (8). I being a closed
interval it follows from condition 3° in Theorem 1 that we can find a constant
C such that

$|J_N(t)| \le CN^{\mu}P(n > N)\,E_{n>N}(e^{t_0Z_N} + e^{-t_0Z_N})$.

From the definition of n and condition 2° in Theorem 1 we have $b < Z_N < a$.
An application of Lemma 2 then shows that $J_N(t)$ tends to zero uniformly in t.
This proves the theorem.

Corollary to Theorem 3. When the conditions stated in the Corollary to Theorem 2
are fulfilled for all t in the closed interval I, Theorem 3 is true.

This is obvious. 

6. The fundamental identity for identically distributed variables. In the
special case of identically distributed variables for which $P(z = 0) < 1$ and
$0 < \varphi(t) < \infty$ we infer from Theorem 1 that the fundamental identity

(9) $E[e^{tZ_n}\varphi(t)^{-n}] = 1$

holds if t is complex and $|\varphi(t)| \ge 1$. This is the case discussed in [1].

Further, when $P(z = 0) < 1$, the integrals $\int_{\alpha}^{\infty} e^{tz}\,dF$ and $\int_{-\infty}^{\beta} e^{tz}\,dF$ cannot both
be zero for every $\alpha > 0$ and $\beta < 0$, and thus we infer from Theorem 2 that the
fundamental identity holds for all real t (if the limits $a_N$ and $b_N$ are chosen in
accordance with the conditions of this theorem). This proposition is somewhat
more general than that proved in [3] by a similar method.

It also follows from the last remark and Theorem 3 that, when $P(z = 0) < 1$,
(9) can be differentiated any number of times for any real t. This proposition
contains the results in [2] and [3] as special cases.

7. A generalization. We finally remark that the assumption made in Theorem
3 that the expressions containing derivatives of $\varphi_\nu(t)$ are uniformly bounded is
unnecessarily restrictive. For example, it seems possible to prove that the first
derivative of (2) may be obtained by differentiation under the expectation
sign if the series (cf. Corollary 1 to Theorem 7.4 in [6])

m-1 »-l 

is uniformly convergent with respect to t.

REFERENCES

[1] A. Wald, "On cumulative sums of random variables," Annals of Math. Stat., Vol. 15
(1944), p. 286.

[2] A. Wald, "Differentiation under the expectation sign in the fundamental identity of
sequential analysis," Annals of Math. Stat., Vol. 17 (1946), pp. 493-497.

[3] G. E. Albert, "A note on the fundamental identity of sequential analysis," Annals of
Math. Stat., Vol. 18 (1947), pp. 593-596 and Vol. 19 (1948), pp. 426-427.

[4] C. Stein, "A note on cumulative sums," Annals of Math. Stat., Vol. 17 (1946), pp. 498-499.

[5] H. Cramér, "Sur un nouveau théorème-limite de la théorie des probabilités," Actualités
scientifiques et industrielles, no. 736, Hermann et Cie., 1938, p. 6.

[6] J. Wolfowitz, "The efficiency of sequential estimates and Wald's equation for
sequential processes," Annals of Math. Stat., Vol. 18 (1947), pp. 228-229.


SPREAD OF MINIMA OF LARGE SAMPLES

By Brockway McMillan
Bell Telephone Laboratories, Murray Hill, N. J.

1. Theorems. Let x have the continuous cumulative distribution function
F(x). Let $(x_1, \cdots, x_N)$ be a sample of N independent values of x and
$y = \inf(x_1, \cdots, x_N)$. Then y is a random variable with the cumulative distribution
function

(1) $G_N(y) = 1 - (1 - F(y))^N$.

Let K values of the new variable y be drawn, $(y_1, \cdots, y_K)$, and let the spread

$w = \sup(y_1, \cdots, y_K) - \inf(y_1, \cdots, y_K)$.





Fixing K, we consider the cumulative distribution function of w, $P_N(w)$, as
$N \to \infty$. That is, we have K large samples of x and wish to examine the spread
among their minima. It is evident intuitively that if F(x) = 0 for some finite x,
these minima are bounded from below and will cluster near the vanishing point
of F(x), making $w \to 0$ statistically as $N \to \infty$. Our theorems also show that
even when $y \to -\infty$ statistically, i.e., when F(x) = 0 for no finite x, the spread
$w \to 0$ statistically if the tail of F(x) is sufficiently small (e.g. Gaussian). On
the other hand, if $F(x) = O(e^{ax})$ as $x \to -\infty$, the distribution $P_N(w)$ does not
peak as $N \to \infty$, while for larger tails (e.g. algebraic) $w \to +\infty$ statistically.
Two simple theorems are

I. If

$\lim_{x\to-\infty}\frac{F(x)}{F(x+s)} = 1$,

then

$\lim_{N\to\infty}P_N(s) = 0$.

II. Let $s > 0$. If $F(x_0) = 0$ for some $x_0 > -\infty$, or if

$\lim_{x\to-\infty}\frac{F(x)}{F(x+s)} = 0$,

then

$\lim_{N\to\infty}P_N(s) = 1$.


Theorem I is directly applicable to distributions with algebraic tails, theorem II
to Gaussian tails. We prove them both as corollaries of the more general results:

III. If

$\liminf_{x\to-\infty}\frac{F(x)}{F(x+s)} = l$,

then

$\limsup_{N\to\infty}P_N(s) \le (1 - l)^{K-1}$.

IV. Let $s > 0$. If F(x) = 0 for no finite x and

$\limsup_{x\to-\infty}\frac{F(x)}{F(x+s)} = L$,

then

$\liminf_{N\to\infty}P_N(s) \ge [e^{-\alpha L} - e^{-\alpha}]^K$

for any $\alpha > 0$.





Theorems III and IV together show that an exponential tail ($F(x) = O(e^{ax})$)
leads to a $P_N(w)$ which, asymptotically, is bounded away from 0 for any $w > 0$
and bounded away from 1 for w sufficiently small.
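
The three regimes can be illustrated numerically. The Python sketch below is an added illustration, not the author's computation. It evaluates $P_N(s) = K\int [G_N(x+s) - G_N(x)]^{K-1}\,dG_N(x)$, the range formula obtained by conditioning on the smallest of the K minima, with a midpoint rule in the variable $u = G_N(x)$. The three parent distributions are hypothetical examples chosen for their tails: $F(x) = e^x$ for $x \le 0$, $F(x) = 1/(-x)$ for $x \le -1$, and $F(x) = e^{-x^2}$ for $x \le 0$.

```python
import math

def spread_cdf(shifted_tail, s, n, k, grid=20000):
    """Approximate P_N(s) = P(spread of K minima <= s) by the midpoint rule.

    shifted_tail(p, s) must return F(x + s) for the point x with F(x) = p.
    """
    total = 0.0
    for i in range(grid):
        u = (i + 0.5) / grid                    # u = G_N(x), x = smallest minimum
        p = -math.expm1(math.log1p(-u) / n)     # p = F(x) = 1 - (1 - u)**(1/n)
        q = shifted_tail(p, s)                  # q = F(x + s)
        g_shift = 1.0 if q >= 1.0 else -math.expm1(n * math.log1p(-q))  # G_N(x + s)
        total += k * (g_shift - u) ** (k - 1)
    return total / grid

def exponential_tail(p, s):
    # F(x) = e^x for x <= 0, so F(x + s) = p * e^s, capped at 1.
    return min(1.0, p * math.exp(s))

def algebraic_tail(p, s):
    # F(x) = 1/(-x) for x <= -1, so x = -1/p and F(x + s) = p / (1 - p*s) while x + s <= -1.
    return p / (1.0 - p * s) if p * (1.0 + s) < 1.0 else 1.0

def gaussian_like_tail(p, s):
    # F(x) = exp(-x^2) for x <= 0, so x = -sqrt(-log p).
    y = -math.sqrt(-math.log(p)) + s
    return math.exp(-y * y) if y < 0.0 else 1.0

if __name__ == "__main__":
    for name, tail in (("exponential", exponential_tail),
                       ("algebraic", algebraic_tail),
                       ("gaussian-like", gaussian_like_tail)):
        print(name, [spread_cdf(tail, 1.0, n, 3) for n in (10**3, 10**6)])
```

Under these assumptions the exponential tail gives values of $P_N(1)$ that barely change with N, the algebraic tail gives values that shrink towards 0, and the Gaussian-like tail gives values that increase towards 1, in line with Theorems I to IV.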

2. Proofs. Explicitly, for any $s \ge 0$,

(2) $P_N(s) = K\int_{-\infty}^{\infty}[G_N(x+s) - G_N(x)]^{K-1}\,dG_N(x+s)$.

Turning now to III: given $\epsilon > 0$, choose $x_1 = x_1(\epsilon)$ so that (i) $F(x_1) \ne 0$, and
(ii), $x < x_1$ implies

(3) $\frac{F(x)}{F(x+s)} > l - \epsilon$.

We then rewrite (2) as

(4) $P_N(s) = K\int_{-\infty}^{x_1}\left[1 - \frac{G_N(x)}{G_N(x+s)}\right]^{K-1}G_N(x+s)^{K-1}\,dG_N(x+s) + K\int_{x_1}^{\infty}[G_N(x+s) - G_N(x)]^{K-1}\,dG_N(x+s)$.

Treating $G_N(x+s)^K$ as the independent variable, the first integral may be
evaluated by the mean value theorem in the form

(5) $\left[1 - \frac{G_N(x_2)}{G_N(x_2+s)}\right]^{K-1}\int_{-\infty}^{x_1}dG_N(x+s)^K \le \left[1 - \frac{G_N(x_2)}{G_N(x_2+s)}\right]^{K-1}$

with an appropriate $x_2 = x_2(N)$, $-\infty < x_2 < x_1$.

Using the form (2) of the integrand in the second term of (4), we may bound
the latter by

(6) $K\int_{x_1}^{\infty}dG_N(x+s) \le K[1 - G_N(x_1+s)]$,

since

$G_N(x+s) - G_N(x) \le 1$.

Now, by factoring (1),

(7) $\frac{G_N(x)}{G_N(x+s)} = \frac{F(x)}{F(x+s)}\cdot\frac{1 + Q + \cdots + Q^{N-1}}{1 + Q_s + \cdots + Q_s^{N-1}} \ge \frac{F(x)}{F(x+s)}$

where $Q = 1 - F(x)$, $Q_s = 1 - F(x+s) \le Q$. Combining (3), (4), (5), (6),
and (7),

$P_N(s) \le [1 - l + \epsilon]^{K-1} + K[1 - G_N(x_1+s)]$.

Since $F(x_1+s) \ge F(x_1) > 0$, we have

$\lim_{N\to\infty}G_N(x_1+s) = 1$.

Hence,

$\limsup_{N\to\infty}P_N(s) \le [1 - l + \epsilon]^{K-1}$

and III follows by letting $\epsilon \to 0$. Then I follows immediately with $l = 1$, when
we note that $P_N(s) \ge 0$.

To prove IV, choose any $\alpha > 0$. By hypothesis, for sufficiently large N we
may always find $x_N = x_N(\alpha)$ such that

(8) $F(x_N) = \frac{\alpha L}{N}$.

By hypothesis, and the monotonicity of F(x), $x_N \to -\infty$ as $N \to \infty$. For any
$\epsilon > 0$, therefore, we can find $N_0 = N_0(\alpha, \epsilon)$ such that $N > N_0$ implies

(9) $\frac{F(x_N)}{F(x_N+s)} \le \frac{L}{1 - \epsilon}$

or $F(x_N+s) \ge \frac{\alpha}{N}(1 - \epsilon)$. Directly from (2), since $s > 0$,

$P_N(s) \ge K\int_{x_N-s}^{x_N}[G_N(x+s) - G_N(x)]^{K-1}\,dG_N(x+s) \ge K\int_{x_N-s}^{x_N}[G_N(x+s) - G_N(x_N)]^{K-1}\,dG_N(x+s)$.

But this last integral is of the form

$\int K(U - G)^{K-1}\,dU = (U - G)^K$,

whence

$P_N(s) \ge [G_N(x_N+s) - G_N(x_N)]^K$

or

(10) $P_N(s) \ge [(1 - F(x_N))^N - (1 - F(x_N+s))^N]^K$.

By (8) and (9), therefore

$P_N(s) \ge \left[\left(1 - \frac{\alpha L}{N}\right)^N - \left(1 - \frac{\alpha(1 - \epsilon)}{N}\right)^N\right]^K$.

Since this holds for all $N > N_0(\alpha, \epsilon)$,

$\liminf_{N\to\infty}P_N(s) \ge [e^{-\alpha L} - e^{-\alpha(1-\epsilon)}]^K$.

This last, in turn, now holds for any $\epsilon > 0$, hence

$\liminf_{N\to\infty}P_N(s) \ge [e^{-\alpha L} - e^{-\alpha}]^K$.

This now holds for any $\alpha > 0$. Maximizing on $\alpha$ yields a sharper bound than the
result of IV. The applicable part of II follows, when $L = 0$, by letting $\alpha \to \infty$.
That the conclusion of II holds when $F(x_0) = 0$ for some finite $x_0$ follows from
(10) with $x_N$ replaced by some $x_1$ such that $F(x_1) = 0$, $F(x_1+s) > 0$.





ON THE CONVERGENCE OF THE CLASSICAL ITERATIVE METHOD 
OF SOLVING LINEAR SIMULTANEOUS EQUATIONS* 

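By Edgar Reich
Massachusetts Institute of Technology

The classical iterative method, or Seidel method, is a scheme for solving the
system of linear algebraic equations

$\sum_{j=1}^{n}A_{ij}x_j = h_i$, $(i = 1, 2, \cdots, n)$,

by successive approximation, as follows:

If $x^{(\nu)} = (x_1^{(\nu)}, x_2^{(\nu)}, \cdots, x_n^{(\nu)})$ is the $\nu$th approximation of the solution, the
$(\nu+1)$st approximation, $x^{(\nu+1)} = (x_1^{(\nu+1)}, x_2^{(\nu+1)}, \cdots, x_n^{(\nu+1)})$, is obtained from
the relations

$A_{11}x_1^{(\nu+1)} + A_{12}x_2^{(\nu)} + A_{13}x_3^{(\nu)} + \cdots + A_{1n}x_n^{(\nu)} = h_1$,
$A_{21}x_1^{(\nu+1)} + A_{22}x_2^{(\nu+1)} + A_{23}x_3^{(\nu)} + \cdots + A_{2n}x_n^{(\nu)} = h_2$,
$A_{31}x_1^{(\nu+1)} + A_{32}x_2^{(\nu+1)} + A_{33}x_3^{(\nu+1)} + \cdots + A_{3n}x_n^{(\nu)} = h_3$,
$\cdots$
$A_{n1}x_1^{(\nu+1)} + A_{n2}x_2^{(\nu+1)} + A_{n3}x_3^{(\nu+1)} + \cdots + A_{nn}x_n^{(\nu+1)} = h_n$,

$x_1^{(\nu+1)}$ being obtained from the first equation, then $x_2^{(\nu+1)}$ from the second, and
so on.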

The given system can be written in matrix notation as $Ax = h$ where A is
a non-singular square matrix of order n, and x and h are column vectors of order n.
Let us define square matrices $A_1$ and $A_2$ as follows:

$(A_1)_{ij} = A_{ij}$ if $i \ge j$, $(A_1)_{ij} = 0$ if $i < j$;

$(A_2)_{ij} = A_{ij}$ if $i < j$, $(A_2)_{ij} = 0$ if $i \ge j$.

(Note that $A_1 + A_2 = A$.)

With this notation the Seidel method can be written as the matric difference
equation

$A_1x^{(\nu+1)} + A_2x^{(\nu)} = h$.
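
The sweep described by this difference equation can be sketched in a few lines of Python. This is an added illustration using plain lists; the function names `seidel_step` and `seidel_solve` and the 2-by-2 test system are hypothetical.

```python
def seidel_step(A, h, x):
    """One Seidel sweep: solve A1 x_new + A2 x_old = h row by row."""
    n = len(A)
    x_new = list(x)
    for i in range(n):
        # For j < i, x_new[j] already holds the (nu+1)st value; for j > i it
        # still holds the nu-th value, which is exactly the A1 / A2 split.
        s = sum(A[i][j] * x_new[j] for j in range(n) if j != i)
        x_new[i] = (h[i] - s) / A[i][i]
    return x_new

def seidel_solve(A, h, x0, sweeps=50):
    """Apply a fixed number of sweeps starting from x0."""
    x = list(x0)
    for _ in range(sweeps):
        x = seidel_step(A, h, x)
    return x

if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]   # symmetric, positive definite
    h = [1.0, 2.0]
    print(seidel_solve(A, h, [0.0, 0.0]))   # exact solution is (1/11, 7/11)
```

Each sweep requires a nonzero diagonal element in every row, which corresponds to the requirement that $A_1$ be invertible.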

Now various writers, among them C. E. Berry in this journal (see list of references
at end of this paper), have shown that a necessary and sufficient condition
for convergence, i.e., a necessary and sufficient condition for

$\lim_{\nu\to\infty}(x_i^{(\nu)} - x_i) = 0$, $(i = 1, 2, \cdots, n)$,

is that

(1) $A_1$ has an inverse; that is $A_{ii} \ne 0$ for any i.

(2) The characteristic roots of $(A_1^{-1}A_2)$ all have an absolute value smaller
than unity.

* Work done under Office of Naval Research Contract N5ori60.

It would be advantageous to rephrase the above condition, if possible, in terms
of simpler requirements on A. As a step in this direction the following theorem
is offered:

Theorem. If A is a real, symmetric nth-order matrix with all terms on its main
diagonal positive, then a necessary and sufficient condition for all the n characteristic
roots of $(A_1^{-1}A_2)$ to be smaller than unity in magnitude is that A is positive
definite.

Proof. Let $z_i$ be a characteristic vector of $(A_1^{-1}A_2)$ corresponding to the
characteristic root $\mu_i$. Then

(1) $(A_1^{-1}A_2)z_j = \mu_jz_j$.

Premultiplying by $\bar z_i'A_1$, where the apostrophe and bar denote transposition
and conjugation respectively:

(2) $\bar z_i'A_2z_j = \mu_j\bar z_i'A_1z_j$.

Consider the bilinear form $\bar z_i'Az_j$.

We have

(3) $\bar z_i'Az_j = \bar z_i'A_1z_j + \bar z_i'A_2z_j = (1 + \mu_j)\bar z_i'A_1z_j$.

Interchanging i and j:

(4) $\bar z_j'Az_i = (1 + \mu_i)\bar z_j'A_1z_i$.

Taking the conjugate:

(5) $z_j'A\bar z_i = \bar z_i'Az_j = (1 + \bar\mu_i)z_j'A_1\bar z_i = (1 + \bar\mu_i)\bar z_i'A_1'z_j$.

Let D be the diagonal matrix with elements

(6) $D_{ij} = A_{ij}\delta_{ij}$.

This makes $A_1' = D + A_2$.

Substituting this in (5):

(7) $\bar z_i'Az_j = (1 + \bar\mu_i)(\bar z_i'Dz_j + \bar z_i'A_2z_j) = (1 + \bar\mu_i)\bar z_i'Dz_j + (1 + \bar\mu_i)\mu_j\bar z_i'A_1z_j$.

Eliminating $\bar z_i'A_1z_j$ between relations (3) and (7) we obtain

(8) $(1 - \bar\mu_i\mu_j)\bar z_i'Az_j = (1 + \bar\mu_i)(1 + \mu_j)\bar z_i'Dz_j$.





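To obtain the necessary condition we use the fact that we must have $|\mu_i| < 1$
and can therefore rewrite (8) as

(9) $\bar z_i'Az_j = \sum_{r=0}^{\infty}(1 + \bar\mu_i)\bar\mu_i^r(1 + \mu_j)\mu_j^r\,\bar z_i'Dz_j$.

If $x = \sum_{i=1}^{m}c_iz_i$ is any linear combination of the $m \le n$ independent characteristic
vectors of $(A_1^{-1}A_2)$ then

$\bar x'Ax = \left(\sum_{i=1}^{m}\bar c_i\bar z_i\right)'A\left(\sum_{j=1}^{m}c_jz_j\right) = \sum_{i,j=1}^{m}\bar c_ic_j\bar z_i'Az_j = \sum_{i,j=1}^{m}\sum_{r=0}^{\infty}\bar c_ic_j(1 + \bar\mu_i)\bar\mu_i^r(1 + \mu_j)\mu_j^r\,\bar z_i'Dz_j$,

or

(10) $\bar x'Ax = \sum_{r=0}^{\infty}\bar y_r'Dy_r$,

where

$y_r = \sum_{i=1}^{m}c_i(1 + \mu_i)\mu_i^rz_i$.

Since by hypothesis $A_{ii} > 0$, D is evidently positive definite, and therefore

(11) $\bar x'Ax > 0$.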

In case the characteristic roots $\mu_i$, $(i = 1, 2, \cdots, n)$, are all distinct there will be n
independent $z_i$ assured, and in that case (11) implies that A is positive definite.
Consider, on the other hand, the case where the $\mu_i$ are not all distinct. Note
that (a) the definiteness properties of a matrix are not changed by sufficiently
small alterations in the elements; (b) the $\mu$'s depend continuously on the elements
of A; (c) the discriminant of (1) is a polynomial in the $A_{ij}$ that does not vanish
identically.* It follows that A must be positive definite even in the case of
repeated roots because an arbitrarily small change in A will separate any multiple
$\mu$'s, still keeping them smaller than unity in magnitude, and not changing the
definiteness properties of A.

This completes the proof that the condition given in the statement of the
theorem is necessary. Now to prove sufficiency:

Setting $i = j$ in relation (8) we obtain

(12) $(1 - |\mu_i|^2)\bar z_i'Az_i = |1 + \mu_i|^2\bar z_i'Dz_i$.

Since both A and D are positive definite

(13) $\bar z_i'Az_i > 0$ and $\bar z_i'Dz_i > 0$.

* The fact that the discriminant is not identically zero follows from easily constructible
counter-examples.






Moreover, we cannot have $\mu_i = -1$ because that would mean by (3) that
$0 = \bar z_i'A_1z_i + \bar z_i'A_2z_i = \bar z_i'Az_i$.

Relation (12) thus implies

(14) $1 - |\mu_i|^2 > 0$

i.e. $|\mu_i| < 1$ as was to be proved.

The part of the theorem giving the sufficient condition was already obtained
by L. Seidel [1] and G. Temple in a somewhat more indirect fashion.
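
For n = 2 the theorem can be verified by hand. The following worked example is an added illustration in the notation of the text, with a, b, d denoting hypothetical entries of A and with $a > 0$, $d > 0$ assumed:

```latex
A=\begin{pmatrix} a & b\\ b & d\end{pmatrix},\qquad
A_1=\begin{pmatrix} a & 0\\ b & d\end{pmatrix},\qquad
A_2=\begin{pmatrix} 0 & b\\ 0 & 0\end{pmatrix},\qquad
A_1^{-1}A_2=\frac{1}{ad}\begin{pmatrix} d & 0\\ -b & a\end{pmatrix}
\begin{pmatrix} 0 & b\\ 0 & 0\end{pmatrix}
=\begin{pmatrix} 0 & b/a\\ 0 & -b^{2}/(ad)\end{pmatrix}.
```

The characteristic roots are therefore $\mu_1 = 0$ and $\mu_2 = -b^2/(ad)$, so both are smaller than unity in magnitude exactly when $b^2 < ad$. Since $a > 0$, this is the condition $\det A > 0$, that is, A positive definite.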

REFERENCES

[1] L. Seidel, "Über ein Verfahren die Gleichungen, auf welche die Methode der kleinsten
Quadrate führt, sowie lineare Gleichungen überhaupt, durch successive Annäherung
aufzulösen," Abhandlungen der Mathematisch-Physikalischen Classe der
Königlich Bayerischen Akademie der Wissenschaften, Vol. 11 (1874), pp. 81-108.

[2] C. E. Berry, "A criterion of convergence for the classical iterative method of solving
linear simultaneous equations," Annals of Math. Stat., Vol. 16 (1945), pp. 398-400.

[3] L. Cesari, "Sulla risoluzione dei sistemi di equazioni lineari per approssimazioni
successive," Rassegna delle Poste, dei Telegrafi e dei Telefoni, Anno 9 (1931).

[4] L. Cesari, "Sulla risoluzione dei sistemi di equazioni lineari per approssimazioni
successive," Reale Accademia Nazionale dei Lincei, Serie 6, Classe di Scienze fisiche,
matematiche e naturali, Rendiconti, Vol. 25 (1937), pp. 422-428.

[5] J. Morris, The Escalator Method in Engineering Vibration Problems, Chapman and Hall
Ltd., London 1947, pp. 63-70.

[6] R. J. Schmidt, "On the numerical solution of linear simultaneous equations by an
iterative method," Phil. Mag., Ser. 7, Vol. 32 (1941), pp. 369-383.


SOME RECURRENCE FORMULAE IN THE INCOMPLETE BETA 
FUNCTION RATIO 

By T. A. Bancroft

Alabama Polytechnic Institute

1. Introduction. It is well known that the incomplete beta function ratio,
defined by

(1) $I_x(p, q) = \frac{B_x(p, q)}{B(p, q)}$,

where

(2) $B_x(p, q) = \int_0^x x^{p-1}(1 - x)^{q-1}\,dx$

and

(3) $B(p, q) = B_1(p, q)$,

is of importance in probability distribution theory, and, hence, also in obtaining
exact probability values in making tests of statistical hypotheses. In constructing
certain extensions [1] of Karl Pearson's "Tables of the Incomplete Beta-Function"
[2], the recurrence formulae contained in the following sections were derived.

2. Derivation of formulae. The incomplete beta function, $B_x(p, q)$, may be
considered as a special case of the hypergeometric series, F(a, b, c, x), thus

(4) $B_x(p, q) = \frac{x^p}{p}F(p, 1 - q, p + 1, x)$.

The series converges for $|x| \le 1$, if and only if $a + b < c$. By setting $a = p$,
$b = 1 - q$, and $c = p + 1$, as in (4), all conditions are satisfied, if we also take
$q > 0$.

Recurrence formulae for F(a, b, c, x), e.g., in the work of Magnus and
Oberhettinger [3], may now be directly converted for use with $B_x(p, q)$ or $I_x(p, q)$.
In particular, using the three identities on page 9 of [3], with x replacing z, we
have

(5) $cF(a, b, c, x) + (b - c)F(a + 1, b, c + 1, x) - b(1 - x)F(a + 1, b + 1, c + 1, x) = 0$,

(6) $c(c - ax - b)F(a, b, c, x) - c(c - b)F(a, b - 1, c, x) + abx(1 - x)F(a + 1, b + 1, c + 1, x) = 0$,

(7) $cF(a, b, c, x) - cF(a, b + 1, c, x) + axF(a + 1, b + 1, c + 1, x) = 0$.

With $a = p$, $b = 1 - q$, and $c = p + 1$, we obtain in turn

(8) $xI_x(p, q) - I_x(p + 1, q) + (1 - x)I_x(p + 1, q - 1) = 0$,

(9) $(p + q - px)I_x(p, q) - qI_x(p, q + 1) - p(1 - x)I_x(p + 1, q - 1) = 0$,

(10) $qI_x(p, q + 1) + pI_x(p + 1, q) - (p + q)I_x(p, q) = 0$.
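
Formulae (8), (9), and (10) can be checked numerically. The Python sketch below is an added illustration. It evaluates $I_x(p, q)$ by composite Simpson integration, a simple stand-in for table values, and returns the left members of the three formulae; the function names are hypothetical, and $p \ge 1$, $q \ge 2$ are assumed so that every integrand is bounded.

```python
import math

def inc_beta_ratio(x, p, q, steps=2000):
    """I_x(p, q) by composite Simpson's rule (steps must be even; assumes p, q >= 1)."""
    h = x / steps

    def f(t):
        return t ** (p - 1) * (1.0 - t) ** (q - 1)

    total = f(0.0) + f(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    beta = math.gamma(p) * math.gamma(q) / math.gamma(p + q)   # complete beta function B(p, q)
    return total * h / (3.0 * beta)

def residuals(x, p, q):
    """Left members of (8), (9), (10); each should vanish up to integration error."""
    def I(a, b):
        return inc_beta_ratio(x, a, b)

    r8 = x * I(p, q) - I(p + 1, q) + (1 - x) * I(p + 1, q - 1)
    r9 = (p + q - p * x) * I(p, q) - q * I(p, q + 1) - p * (1 - x) * I(p + 1, q - 1)
    r10 = q * I(p, q + 1) + p * I(p + 1, q) - (p + q) * I(p, q)
    return r8, r9, r10

if __name__ == "__main__":
    print(residuals(0.4, 3, 4))
```

All three residuals are zero up to the integration error, whatever admissible x, p, q are tried.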

Formula (8) is the basic recurrence formula used in the construction of Karl 
Pearson’s [2] tables. Formula (10) was obtained, incidentally, by the author [4] 
in a different connection and manner. 

Formulae (8), (9), and (10) may now be combined to give other useful formulae, 
e. g., 

(11) $qI_x(p + 1, q + 1) + (px + qx - q)I_x(p + 1, q) - (p + q)xI_x(p, q) = 0$,

(12) $pI_x(p + 1, q + 1) + (q - px - qx)I_x(p, q + 1) - (p + q)(1 - x)I_x(p, q) = 0$,

(13) $(p + q - 1)xI_x(p - 1, q) - [(p + q - 1)x + p]I_x(p, q) + pI_x(p + 1, q) = 0$,

(14) $(p + q)(1 - x)I_x(p + 1, q - 1) - [(p + q)(1 - x) + q]I_x(p + 1, q) + qI_x(p + 1, q + 1) = 0$.

Notice that the sum of the coefficients is always zero.

By a repeated use of (10) it is possible to obtain the formulae

(15) $I_x(p + n, q) = \frac{1}{(p + n - 1)^{(n)}}\sum_{r=0}^{n}(-1)^r\binom{n}{r}(p + q + n - 1)^{(n-r)}(q + r - 1)^{(r)}I_x(p, q + r)$,

(16) $I_x(p, q + n) = \frac{1}{(q + n - 1)^{(n)}}\sum_{r=0}^{n}(-1)^r\binom{n}{r}(p + q + n - 1)^{(n-r)}(p + r - 1)^{(r)}I_x(p + r, q)$,

where $(p + q + n - 1)^{(n-r)}$, etc., refer to the factorial notation, e.g.,

$[p + q + (n - 1)]^{(n-r)} = (p + q + n - 1)(p + q + n - 2)\cdots(p + q + r)$.

3. An application. Formulae (15) and (16) may be used to write general
formulae for obtaining values of $I_x(p, q)$ where p or q may be greater than 50,
i.e., for such values outside the range of Karl Pearson's tables. In particular,

(17) $I_x(50 + n, q) = \frac{1}{(49 + n)^{(n)}}\left[(n + q + 49)^{(n)}I_x(50, q) - \binom{n}{1}q(n + q + 49)^{(n-1)}I_x(50, q + 1) \cdots (-1)^n(q + n - 1)^{(n)}I_x(50, q + n)\right]$

and

(18) $I_x(p, 50 + n) = \frac{1}{(49 + n)^{(n)}}\left[(n + p + 49)^{(n)}I_x(p, 50) - \binom{n}{1}p(n + p + 49)^{(n-1)}I_x(p + 1, 50) \cdots (-1)^n(p + n - 1)^{(n)}I_x(p + n, 50)\right]$.

It should be noted for (17) that as n increases the range of values that can be
obtained outside Karl Pearson's tables is reduced since the last term of (17)
contains $I_x(50, q + n)$. A similar observation is noted for (18). From a practical
standpoint the computational labor restricts n to fairly small values. Using (17)
we may easily compute, for example,

$I_{.60}(52, 48) = I_{.60}(50 + 2, 48) = \frac{1}{(51)(50)}[(99)(98)I_{.60}(50, 48) - 2(99)(48)I_{.60}(50, 49) + (49)(48)I_{.60}(50, 50)]$.

Substituting the necessary values from Karl Pearson's tables we calculate

$I_{.60}(52, 48) = .9465248$.

Similarly using (18) we may calculate

$I_{.40}(48, 52) = .0534752$.

As a check on the computations, we use the well-known identity

$I_x(p, q) = 1 - I_{1-x}(p', q')$,

where $p' = q$ and $q' = p$. Then

$I_{.40}(48, 52) = 1 - I_{.60}(52, 48) = 1 - .9465248 = .0534752$.
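
The extension formula for raising p by n can be sketched in Python. This is an added illustration, not the author's computing procedure: the values that the text takes from Karl Pearson's tables are replaced by Simpson integration, and the function names `inc_beta_ratio`, `falling`, and `raise_p` are hypothetical. The sketch combines the values $I_x(p, q + r)$, $r = 0, \cdots, n$, with the factorial coefficients, as in the worked example for $I_{.60}(52, 48)$.

```python
import math

def inc_beta_ratio(x, p, q, steps=4000):
    """I_x(p, q) by composite Simpson's rule; assumes p > 1 and q > 1."""
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)

    def f(t):
        if t <= 0.0 or t >= 1.0:
            return 0.0   # limiting value of the density when p > 1 and q > 1
        return math.exp((p - 1) * math.log(t) + (q - 1) * math.log1p(-t) - log_beta)

    h = x / steps
    total = f(0.0) + f(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

def falling(a, k):
    """Factorial notation a^(k) = a (a - 1) ... (a - k + 1), with a^(0) = 1."""
    out = 1.0
    for j in range(k):
        out *= a - j
    return out

def raise_p(x, p, q, n, table=inc_beta_ratio):
    """I_x(p + n, q) from the n + 1 values I_x(p, q + r), r = 0, ..., n."""
    total = 0.0
    for r in range(n + 1):
        total += ((-1) ** r * math.comb(n, r) * falling(p + q + n - 1, n - r)
                  * falling(q + r - 1, r) * table(x, p, q + r))
    return total / falling(p + n - 1, n)

if __name__ == "__main__":
    print(raise_p(0.60, 50, 48, 2), inc_beta_ratio(0.60, 52, 48))
```

With n = 2, p = 50, q = 48 the coefficients are (99)(98), 2(99)(48), and (49)(48) over (51)(50), the same as in the worked example, and the two printed numbers agree closely.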


In like manner formulae (15) and (16) may be used to write general formulae
for obtaining half values for p or q greater than 10.5, i.e., for values not
included in Karl Pearson's tables. In particular,

(19) $I_x(10.5 + n, q) = \frac{1}{(9.5 + n)^{(n)}}\left[(9.5 + q + n)^{(n)}I_x(10.5, q) - \binom{n}{1}q(9.5 + q + n)^{(n-1)}I_x(10.5, q + 1) \cdots (-1)^n(q + n - 1)^{(n)}I_x(10.5, q + n)\right]$,

and

(20) $I_x(p, 10.5 + n) = \frac{1}{(9.5 + n)^{(n)}}\left[(9.5 + p + n)^{(n)}I_x(p, 10.5) - \binom{n}{1}p(9.5 + p + n)^{(n-1)}I_x(p + 1, 10.5) \cdots (-1)^n(p + n - 1)^{(n)}I_x(p + n, 10.5)\right]$.

Using (19) we may compute

$I_{.60}(12.5, 8) = \frac{1}{(11.5)(10.5)}[(19.5)^{(2)}I_{.60}(10.5, 8) - 2(8)(19.5)I_{.60}(10.5, 9) + (9)(8)I_{.60}(10.5, 10)] = .4512367$.

Similarly using (20) we obtain

$I_{.40}(8, 12.5) = .5487633$.

Employing the check formula,

$I_{.40}(8, 12.5) = 1 - I_{.60}(12.5, 8) = 1 - .4512367 = .5487633$.




Thanks are due to Dr. J. C. P. Miller, Technical Director, Scientific Computing
Service, Limited, London, England, for helpful suggestions in the preparation
of this paper.

REFERENCES

[1] T. A. Bancroft, "Some extensions of the incomplete beta function tables" (in preparation).

[2] Karl Pearson, Tables of the Incomplete Beta-Function, Cambridge University Press,
1934.

[3] Wilhelm Magnus und Fritz Oberhettinger, Formeln und Sätze für die Speziellen
Funktionen der Mathematischen Physik, Julius Springer, Berlin, 1943.

[4] T. A. Bancroft, "On biases in estimation due to the use of preliminary tests of
significance," Annals of Math. Stat., Vol. 15 (1944).


ON A THEOREM BY WALD AND WOLFOWITZ 


By Gottfried E. Noether 
New York University

Let $\mathfrak{H}_n = (h_1, \cdots, h_n)$, $(n = 1, 2, \cdots)$, be sequences of real numbers and for
all n denote by $H_{i_1\cdots i_m}$ the symmetrical function generated by $h_1^{i_1}\cdots h_m^{i_m}$,
i.e., $H_{i_1\cdots i_m} = \sum h_{j_1}^{i_1}\cdots h_{j_m}^{i_m}$ where the summation is extended over the
$n(n - 1)\cdots(n - m + 1)$ possible arrangements of the m integers $j_1, \cdots, j_m$ such that
$1 \le j_k \le n$ and $j_k \ne j_l$, $(k \ne l;\ k, l = 1, \cdots, m)$. According to Wald and Wolfowitz
[1] the sequences $\mathfrak{H}_n$ are said to satisfy condition W, if for all integral $r > 2$

$\frac{\frac{1}{n}\sum_{i=1}^{n}(h_i - \bar h)^r}{\left[\frac{1}{n}\sum_{i=1}^{n}(h_i - \bar h)^2\right]^{r/2}} = O(1)$,¹

where $\bar h = \frac{1}{n}\sum_{i=1}^{n}h_i$.

Given sequences $\mathfrak{A}_n = (a_1, \cdots, a_n)$ and $\mathfrak{D}_n = (d_1, \cdots, d_n)$, consider the
chance variable

$L_n = d_1x_1 + \cdots + d_nx_n$,

where the domain of $(x_1, \cdots, x_n)$ consists of the n! equally likely permutations
of the elements of $\mathfrak{A}_n$. Then it is shown in [1] that if the sequences $\mathfrak{A}_n$ and $\mathfrak{D}_n$
satisfy condition W, the distribution of $L_n^* = (L_n - EL_n)/\sigma(L_n)$ approaches the
normal distribution with mean 0 and variance 1 as $n \to \infty$. These conditions
for asymptotic normality can be weakened. It will be shown that the following
theorem holds:

¹ The symbol O, as well as the symbols o and ~ to be used later, have their usual meaning.
See e.g. Cramér [2, p. 122].

Theorem. $L_n^*$ is asymptotically normal with mean 0 and variance 1 provided the
sequences $\mathfrak{D}_n$ satisfy condition W while for the sequences $\mathfrak{A}_n$

(1) $\frac{\sum_{i=1}^{n}(a_i - \bar a)^r}{\left[\sum_{i=1}^{n}(a_i - \bar a)^2\right]^{r/2}} = o(1)$, $(r = 3, 4, \cdots)$.


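We note that $L_n^*$ is not changed if $a_i$ is replaced by $\left[\frac{1}{n}\sum_{i=1}^{n}(a_i - \bar a)^2\right]^{-1/2}(a_i - \bar a)$
and $d_i$ by $\left[\frac{1}{n}\sum_{i=1}^{n}(d_i - \bar d)^2\right]^{-1/2}(d_i - \bar d)$. Therefore it is sufficient to prove
asymptotic normality provided

(2) $D_1 = 0$, $D_2 = n$, $D_r = O(n)$, $(r = 3, 4, \cdots)$,

(3) $A_1 = 0$, $A_2 = n$, $A_r = o(n^{r/2})$, $(r = 3, 4, \cdots)$.

Then

$EL_n = D_1Ex_1 = 0$,

$\operatorname{var} L_n = EL_n^2 = D_2Ex_1^2 + D_{11}Ex_1x_2 = \frac{n^2}{n - 1}$,

and it is sufficient to show that $n^{-r/2}EL_n^r$ tends to the rth moment of a normal
distribution with mean 0 and variance 1.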
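
For small n the first two moments of $L_n$ under this normalization can be checked by enumerating all n! permutations. The Python sketch below is an added illustration; the function names and the two test sequences are hypothetical. After standardizing both sequences so that $A_1 = D_1 = 0$ and $A_2 = D_2 = n$, the mean should be 0 and the variance $n^2/(n - 1)$, the value implied by $Ex_1^2 = 1$ and $Ex_1x_2 = -1/(n - 1)$.

```python
import itertools
import math

def standardize(v):
    """Shift and scale v so that its sum is 0 and its sum of squares is len(v)."""
    n = len(v)
    mean = sum(v) / n
    scale = math.sqrt(sum((t - mean) ** 2 for t in v) / n)
    return [(t - mean) / scale for t in v]

def permutation_moments(a, d):
    """Exact mean and variance of L_n = sum d_i x_i over all permutations x of a."""
    a, d = standardize(a), standardize(d)
    values = [sum(di * xi for di, xi in zip(d, perm))
              for perm in itertools.permutations(a)]
    mean = sum(values) / len(values)
    var = sum(v * v for v in values) / len(values) - mean ** 2
    return mean, var

if __name__ == "__main__":
    mean, var = permutation_moments([1, 2, 4, 7, 11, 16], [3, 1, 4, 1, 5, 9])
    print(mean, var, 6 ** 2 / 5)
```

The enumeration is only feasible for small n, but it confirms that the variance is of order n, which is why the rth moment is divided by $n^{r/2}$.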

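Now we can write

$\mu_r = n^{-r/2}EL_n^r = n^{-r/2}\sum_{i_1=1}^{n}\cdots\sum_{i_r=1}^{n}E\,d_{i_1}x_{i_1}\cdots d_{i_r}x_{i_r}$

(4) $= n^{-r/2}[D_rEx_1^r + \cdots + c(r, e_1, \cdots, e_m)D_{e_1\cdots e_m}Ex_1^{e_1}\cdots x_m^{e_m} + \cdots + D_{1\cdots1}Ex_1\cdots x_r]$

where $e_1 + \cdots + e_m = r$ with $e_k$, $(k = 1, \cdots, m)$, positive integral and the
coefficient $c(r, e_1, \cdots, e_m)$ stands for the number of ways in which the r indices
$i_1, \cdots, i_r$ can be tied in m groups of size $e_1, \cdots, e_m$, respectively, so as to
produce the terms of $D_{e_1\cdots e_m}Ex_1^{e_1}\cdots x_m^{e_m}$.

Since $Ex_1^{e_1}\cdots x_m^{e_m} \sim n^{-m}A_{e_1\cdots e_m}$ we have

(5) $n^{-r/2}D_{e_1\cdots e_m}Ex_1^{e_1}\cdots x_m^{e_m} \sim n^{-(r/2+m)}D_{e_1\cdots e_m}A_{e_1\cdots e_m} = B(r, e_1, \cdots, e_m)$, say.

Lemma. $B(r, e_1, \cdots, e_m) \sim 0$ unless

(6) $m = r/2$, $e_1 = \cdots = e_{r/2} = 2$.

In that case $B(r, 2, \cdots, 2) \sim 1$.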

Before proving this lemma we shall show that our theorem follows immediately.
By (4) $\mu_r$ is the sum of a finite number of expressions $B(r, e_1, \cdots, e_m)$.
Therefore if $r = 2s + 1$, $(s = 1, 2, \cdots)$, $\mu_{2s+1} \sim 0$, since at least one of the $e_k$,
$(k = 1, \cdots, m)$, in all the $B(2s + 1, e_1, \cdots, e_m)$ adding up to $\mu_{2s+1}$ must be odd.
If $r = 2s$, $\mu_{2s} \sim c(2s, 2, \cdots, 2)$. Since the first index in (4) can be tied with any
one of the other $2s - 1$ indices, the next free index with any one of the remaining
$2s - 3$ indices, etc., it is seen that $\mu_{2s} \sim (2s - 1)(2s - 3) \cdots 3$. However these
are the moments of a normal distribution with mean 0 and variance 1. This
proves the theorem.
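
The counting step in this argument can be checked numerically. The following is a minimal sketch, not part of the original paper: it enumerates every way of tying $2s$ indices into groups of size 2 and compares the count with $(2s - 1)(2s - 3) \cdots 3$ and with the closed form $(2s)!/(2^s s!)$ for the even moments of a normal distribution with mean 0 and variance 1. The function names are illustrative choices.

```python
from math import factorial


def pairings(indices):
    """Yield every way of tying the given indices into groups of size 2."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    # The first free index can be tied with any one of the remaining indices.
    for pos, partner in enumerate(rest):
        remaining = rest[:pos] + rest[pos + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def double_factorial_odd(s):
    """Return (2s - 1)(2s - 3) ... 3 * 1."""
    out = 1
    for j in range(1, 2 * s, 2):
        out *= j
    return out


def normal_even_moment(s):
    """2s-th moment of a normal variable with mean 0 and variance 1."""
    return factorial(2 * s) // (2 ** s * factorial(s))


for s in range(1, 5):
    count = sum(1 for _ in pairings(list(range(2 * s))))
    print(s, count, double_factorial_odd(s), normal_even_moment(s))
```

The three columns printed for each s agree, which is the identity used in the last step of the proof.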

PROOF OF LEMMA. Define $A(j_1, \cdots, j_h) = A_{j_1} \cdots A_{j_h}$. Then $A_{e_1 \cdots e_m}$ is the
sum of a finite number of expressions $A(j_1, \cdots, j_h)$, where the $j_g$, $(g = 1, \cdots, h)$,
are obtained from $e_1, \cdots, e_m$ by addition in such a way that

(7)  $j_1 + \cdots + j_h = e_1 + \cdots + e_m = r.$

Since by (3) $A_1 = 0$, we need only consider those $A(j_1, \cdots, j_h)$ for which
$j_g \ge 2$, $(g = 1, \cdots, h)$. If some $j_g > 2$, by (3) and (7)

(8)  $A(j_1, \cdots, j_h) = o(n^{r/2}).$

If $j_g = 2$ for all g,

(9)  $A(2, \cdots, 2) = n^{r/2}.$

This last case can only happen if r is even and $e_k$, $(k = 1, \cdots, m)$, equals either
1 or 2. Therefore, unless (6) is true

(10)  $m > r/2.$

Similarly, writing $D_{e_1 \cdots e_m}$ as a sum of products of the kind $D_{j_1} \cdots D_{j_h}$ it is
seen that by (2)

(11)  $D_{e_1 \cdots e_m} = O(n^{m})$ if $m \le r/2$, $\qquad D_{e_1 \cdots e_m} = O(n^{r/2})$ if $m > r/2$.

Thus by (8)-(11)

(12)  $A_{e_1 \cdots e_m} D_{e_1 \cdots e_m} = o(n^{r/2 + m}),$

unless (6) is true. In that case

(13)  $A_{2 \cdots 2} \sim n^{r/2},$

(14)  $D_{2 \cdots 2} \sim n^{r/2}.$

(12)-(14) together with (5) prove the lemma.

Let $a_1, a_2, \cdots$ be independent observations on the same chance variable Y.
We may ask what conditions have to be imposed on the distribution of Y to
insure, at least with probability 1, that condition (1) is satisfied. Wald and
Wolfowitz state in Corollary 2 of [1] that provided Y has positive variance and
finite moments of all orders the $a_1, a_2, \cdots$ satisfy condition W with probability
1 and therefore insure asymptotic normality of $L_n^0$ provided the sequences $\mathfrak{D}_n$
satisfy condition W. On the other hand, it can be shown that the $a_1, a_2, \cdots$
satisfy condition (1) with probability 1, provided Y has positive variance and
a finite absolute moment of order 3. Thus condition (1) constitutes a considerable
improvement over condition W.


REFERENCES

[1] A. Wald and J. Wolfowitz, "Statistical tests based on permutations of the observations," Annals of Math. Stat., Vol. 15 (1944), pp. 358-372.

[2] H. Cramér, Mathematical Methods of Statistics, Princeton, 1946.


ON SUMS OF SYMMETRICALLY TRUNCATED NORMAL RANDOM 

VARIABLES 


By Z. W. Birnbaum and F. C. Andrews¹
University of Washington, Seattle 


1. Introduction. Let $X_a$ be the random variable with the probability density

(1.1)  $f_a(x) = \dfrac{e^{-x^2/2}}{\int_{-a}^{a} e^{-t^2/2}\, dt}$ for $|x| \le a$, $\qquad f_a(x) = 0$ for $|x| > a$,

obtained from the normal probability density $\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ by symmetrical truncation
at the "terminus" $|x| = a$, and let $S_a^{(m)}$ be the sum of m independent sample-values
of $X_a$. We consider the following problem: An integer $m \ge 2$ and the real
numbers $A > 0$, $\epsilon > 0$ are given; how does one have to choose the terminus $a$
so that the probability of $|S_a^{(m)}| > A$ is equal to $\epsilon$,

(1.2)  $P(|S_a^{(m)}| > A) = \epsilon$?

This problem arises for example when single components of a product are
manufactured under statistical quality control, so that each component has the
length $Z = k + X$ where X has the probability density $f_a(x)$, and the final
product consists of m components so that its total length S is the sum of the
lengths of the components. We wish to have probability $1 - \epsilon$ that S differs
from $mk$ by not more than a given A. To achieve this we decide to reject each
single component for which $|Z - k| = |X| > a$; how do we determine $a$?

The exact solution of this problem would require laborious computations.²
In the present paper methods are given for obtaining approximate values of $a$
which are "safe", that is such that

(1.3)  $P(|S_a^{(m)}| > A) \le \epsilon.$


¹ Research done under the sponsorship of the Office of Naval Research.
² A similar problem has been studied by V. J. Francis [2] for one-sided truncation; he
actually had the exact probabilities for the solution of his problem computed and tabulated
for m = 2, 4.





In deriving these safe values, use will be made of theorems on random variables
with comparable peakedness, for which the reader is referred to a previous
paper [1].


2. The safe value $a_1$. For fixed $a > 0$, we consider the normal random variable
$Y_a$ with expectation 0 and with probability density $g_a(y)$ such that $g_a(0) = f_a(0)$.
It is easily seen that $Y_a$ has the standard deviation

(2.1)  $\sigma_a = \dfrac{1}{\sqrt{2\pi}} \int_{-a}^{+a} e^{-t^2/2}\, dt,$

and that $g_a(\xi) < f_a(\xi)$ for $0 < |\xi| < a$, $g_a(\xi) > 0 = f_a(\xi)$ for $|\xi| > a$. Hence, applying
Theorem 1 in [1], we conclude that

(2.2)  $P(|S_a^{(m)}| > A) \le \dfrac{2}{\sqrt{2\pi}} \int_{A/(\sigma_a \sqrt{m})}^{\infty} e^{-t^2/2}\, dt.$

If m, A, and $\epsilon$ are given, we determine $t_\epsilon$ from tables of the normal probability
integral so that $\dfrac{2}{\sqrt{2\pi}} \int_{t_\epsilon}^{\infty} e^{-t^2/2}\, dt = \epsilon$, set $\sigma_a = \dfrac{A}{t_\epsilon \sqrt{m}}$ in (2.1), and solve the
equation

(2.3)  $\dfrac{A}{t_\epsilon \sqrt{m}} = \dfrac{1}{\sqrt{2\pi}} \int_{-a}^{+a} e^{-t^2/2}\, dt$

for $a$ using again tables of the normal probability integral. In view of (2.2) this
solution satisfies (1.3) and hence is safe; it will be denoted by $a_1$.
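
The table look-ups of this section can be imitated numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: it assumes that the standard deviation in (2.1) equals $\operatorname{erf}(a/\sqrt{2})$, takes $t_\epsilon$ as the upper $\epsilon/2$ point of the standard normal distribution, and solves for the terminus by bisection. The function names are hypothetical.

```python
from math import erf, sqrt
from statistics import NormalDist


def sigma_a(a):
    """Assumed form of (2.1): (1/sqrt(2*pi)) * integral of exp(-t^2/2) over (-a, a)."""
    return erf(a / sqrt(2.0))


def safe_value_a1(A, m, eps):
    """Solve sigma_a = A / (t_eps * sqrt(m)) for the terminus a by bisection."""
    t_eps = NormalDist().inv_cdf(1.0 - eps / 2.0)  # two-tail point: tail mass eps
    target = A / (t_eps * sqrt(m))
    if target >= 1.0:
        # sigma_a < 1 for every finite a, so no finite terminus solves the equation.
        raise ValueError("no finite solution: A / (t_eps * sqrt(m)) >= 1")
    lo, hi = 0.0, 1.0
    while sigma_a(hi) < target:  # bracket the root; sigma_a increases with a
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sigma_a(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Inputs taken from the first worked example of the paper: A = 3.8, m = 4, eps = .05.
print(round(safe_value_a1(3.8, 4, 0.05), 3))
```

With these inputs the result lies close to the value 2.162 quoted in the paper's examples.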


3. The safe value $a_2$. A direct application of Theorem 2 in [1] yields the
inequality

(3.1)  $P(|S_a^{(m)}| > A) \le \dfrac{1}{2^{m-1}\, m!} \sum_{(m + A/a)/2 \,<\, j \,\le\, m} (-1)^{m-j} \binom{m}{j} \left(2j - m - \dfrac{A}{a}\right)^{m} = h_m(A/a)$

for $0 < A < ma$. Hence by equating $h_m(A/a)$ to $\epsilon$ and solving for $a$, we obtain a
safe value which will be denoted by $a_2$. It is of interest to note that (3.1) is true
not only for $f_a(x)$ defined by (1.1), i.e. truncated normal, but for any probability
density $f_a(x)$ which is symmetrical and unimodal, since these are the only assumptions
needed for Theorem 2 in [1].
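
A numerical sketch of this bound follows. It rests on the assumption that the right-hand side of (3.1) is the two-tail probability for the sum of m independent variables distributed uniformly on $(-a, +a)$, which depends on A and $a$ only through the ratio $A/a$. That probability is evaluated here with the standard formula for the distribution of a sum of uniform variables; the function names are hypothetical.

```python
from math import comb, factorial


def uniform_sum_two_tail(m, ratio):
    """P(|U| > A) for U a sum of m independent uniforms on (-a, a), ratio = A/a."""
    if ratio >= m:
        return 0.0
    if ratio <= 0.0:
        return 1.0
    # By symmetry P(|U| > A) = 2 * P(V_1 + ... + V_m < x) with V_i uniform on (0, 1).
    x = (m - ratio) / 2.0
    total = 0.0
    k = 0
    while k < x:
        total += (-1) ** k * comb(m, k) * (x - k) ** m
        k += 1
    return 2.0 * total / factorial(m)


def ratio_for_eps(m, eps):
    """Find A/a with two-tail probability eps (the tail decreases in A/a)."""
    lo, hi = 0.0, float(m)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if uniform_sum_two_tail(m, mid) > eps:
            lo = mid
        else:
            hi = mid
    return hi


def safe_value_a2(A, m, eps):
    """Terminus obtained by equating the uniform-sum bound to eps."""
    return A / ratio_for_eps(m, eps)


# Inputs taken from the first worked example of the paper: A = 3.8, m = 4, eps = .05.
print(round(ratio_for_eps(4, 0.05), 3), round(safe_value_a2(3.8, 4, 0.05), 3))
```

For m = 2 the bound reduces to $(1 - A/(2a))^2$, which gives a quick hand check of the function.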


4. Solution for large m. The random variable $X_a$ has the variance

(4.1)  $\sigma^2(X_a) = 1 + \dfrac{2\Phi''(a)}{2\Phi(a) - 1},$

where

$\Phi(x) = \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt.$

Hence, according to the central limit theorem, we have the approximate equality

(4.2)  $P(|S_a^{(m)}| > A) \approx \dfrac{2}{\sqrt{2\pi}} \int_{A/(\sigma(X_a)\sqrt{m})}^{\infty} e^{-t^2/2}\, dt$

for m sufficiently large.

It can be reasonably expected that the cumulative distribution of $S_a^{(m)}$ differs
from its limiting normal probability integral by less than the cumulative distribution
of the sum $U_a^{(m)}$ of m independent uniform variables in $(-a, +a)$ differs
from its limiting normal probability integral. Already for m = 4 the cumulative
distribution of $U_a^{(4)}$ differs from the corresponding normal cumulative by less
than .0075. Equally good or better approximation may, therefore, be expected
for the distribution of $S_a^{(m)}$, so that the error in the approximate equality (4.2)
between the two-tail probabilities should be less than .015 for m = 4, and still
less for m > 4.

Equating the right-hand term of (4.2) to $\epsilon$ and solving for $\sigma^2(X_a)$, we obtain

$\sigma^2(X_a) = 1 + \dfrac{2\Phi''(a)}{2\Phi(a) - 1} = \dfrac{A^2}{m\, t_\epsilon^2},$

an equation which can be solved for $a$ with the aid of tables of $\Phi(x)$ and $\Phi''(x)$.
We denote this value of $a$ by $a_3$.


5. Use of the different solutions in practice. From the foregoing it appears
that the following procedure may be followed in solving our problem in any
definite case:

If m is large, $a_3$ is very close to the exact solution of (1.3) and may be used
safely.

If m is not large but $m \ge 5$, it is conjectured that $a_3$ is such that the left-hand
term in (1.3), for $a = a_3$, differs from $\epsilon$ by less than 0.015.

If $m \le 4$, the larger of $a_1$ and $a_2$ should be used. Table I contains the A for
which $a_1$ and $a_2$ have the same value, say $a'$; $a_1$ or $a_2$ should be used if the given A
is greater or smaller, respectively, than the tabulated value. The value $a_1$ is
easily computed from a table of the normal probability integral by the procedure
of section 2. The value $a_2$ can be obtained by reading off $A/a_2$ from Table II.


TABLE I
Values of A for which $a_1 = a_2 = a'$, for given m, $\epsilon$

  ε          m = 2              m = 3              m = 4
           A       a'         A       a'         A       a'
 .001    4.568   2.357      5.446   2.008      6.152   1.842
 .002    4.258   2.228      5.059   1.918      5.717   1.779
 .005    3.808   2.047      4.512   1.799      5.111   1.697
 .01     3.438   1.910      4.074   1.712      4.632   1.640
 .02     3.034   1.766      3.614   1.630      4.131   1.589
 .05     2.456   1.581      2.970   1.533      3.425   1.529


TABLE II
Values of $A/a_2$ for given m, $\epsilon$

  ε       m = 2    m = 3    m = 4
 .001     1.937    2.712    3.339
 .002     1.911    2.637    3.213
 .005     1.859    2.507    3.011
 .01      1.800    2.379    2.824
 .02      1.718    2.217    2.600
 .05      1.553    1.937    2.240







6. Examples. 1) A = 3.8, m = 4, $\epsilon$ = .05. Since A is greater than the value
3.425 in Table I, we compute $a_1$ = 2.162. From Table II we would obtain
$A/a_2$ = 2.240 and thus $a_2$ = 1.696 < $a_1$. 2) A = 3, m = 4, $\epsilon$ = .02. Since A < 4.131,
we read $A/a_2$ = 2.600 from Table II and obtain $a_2$ = 1.153 which will be greater
than $a_1$. 3) A = 5, m = 30, $\epsilon$ = .05. Using the method of section 4 we obtain
$a_3$ = 1.62.


REFERENCES

[1] Z. W. Birnbaum, "On random variables with comparable peakedness," Annals of Math. Stat., Vol. 19 (1948), pp. 76-81.

[2] V. J. Francis, "On the distribution of the sum of n sample values drawn from a truncated normal population," Roy. Stat. Soc. Jour. Suppl., Vol. 8 (1946), pp. 223-232.


A CERTAIN CUMULATIVE PROBABILITY FUNCTION 
By Sister Mary Agnes Hatke, O.S.F.

St. Francis College, Ft. Wayne, Indiana 

Graduations of empirically observed distributions show that the cumulative
probability function $F(x) = 1 - (1 + x^{1/c})^{-1/k}$ is a practical tool for fitting a
smooth curve to observed data. The graduations are comparable with those
obtained by the Pearson system, Charlier, and others and are accomplished
with simple calculations. Given distributions are graduated by the method of
moments. Theoretical frequencies are obtained by evaluation of consecutive
values of F(x) by use of calculating machines and logarithms, and by differencing
NF(x). No integration nor heavy interpolation is involved, such as may be
required in graduation by a classical frequency function. Burr [1] constructed
tables of $\nu_1$, $\sigma$, $\alpha_3$, and $\alpha_4$ values for the function F(x) for certain combinations
of integral values of 1/c and 1/k. In these tables curvilinear interpolation must
be used in finding an F(x) with desired moments. The writer constructed more
extensive tables for the same cumulative function with c and k a variety of
real positive numbers less than or equal to one, such that linear interpolation
can be used to determine the parameters c and k for an F(x) that has $\alpha_3$ and
$\alpha_4$ approximately the same as those of the distribution to be graduated. These
tables have been deposited with Brown University. Microfilm or photostat copies
may be obtained upon request to the Brown University Library.

The writer used the definitions of cumulative moments and the formulas
for the ordinary moments $\nu_1$, $\sigma$, $\alpha_3$, and $\alpha_4$ in terms of cumulative moments
as developed by Burr. These latter moments were tabulated for the function F(x)
having various combinations of parameters c and k, c ranging from 0.050 to 0.675
and k from 0.050 to 1.000, each at intervals of 0.025. Within these ranges only
those combinations of c and k were used which yielded $\alpha_3$ of approximately 1 or
less and $\alpha_4$ values of 6 or less, since such moments are most common in practice.

It can be verified that over most of the area of the table $\alpha_3$ values obtained
by linear and by curvilinear interpolation on k (or on c) differ by less than 0.001
and values of $\alpha_4$ by approximately 0.01 or less. If $\alpha_3$ = constant and $\alpha_4$ = constant
curves are plotted on c, k axes, it will be seen that there exists only one solution
(c, k) of the equations $\alpha_3 = B(c, k)$ and $\alpha_4 = C(c, k)$. Furthermore, some $\alpha_4$
curves intersect two $\alpha_3$ curves representing the same $|\alpha_3|$. Thus the chance of
finding an appropriate function F(x) for graduation is increased since by reversal
of scale an F(x) with a positive $\alpha_3$ may be used to graduate a distribution with a
negative $\alpha_3$, and conversely.

Graduation of an observed frequency distribution is easily accomplished.
Linear interpolation on k for a fixed c seems to be the best method for determining
the parameters of an F(x) that has $\alpha_3$ exactly the same and $\alpha_4$ nearly the same as
the observed $\alpha_3$ and $\alpha_4$. If the observed $\alpha_3$ and $\alpha_4$ are fairly close to an entry
in the table, no interpolation is required. Direct linear interpolation is used to
determine $\nu_1$ and $\sigma$ for the c and k just found. Letting M and S be the mean and
standard deviation of the given distribution, the formula,

$\dfrac{x - \nu_1}{\sigma} = \dfrac{X - M}{S},$

is used to translate the class limits X of the given distribution to the corresponding
x's of F(x). For any x that is negative the quantity $1 + x^{1/c}$ is taken as one
to make $F(-x) = 0$ in accordance with the definition of F(x) [1]. The values of
$N/(1 + x^{1/c})^{1/k}$ for the various x's are computed by logarithms and differenced to
obtain the probabilities for the given class intervals, according to equation

$P(a < x \le b) = \int_a^b f(x)\, dx = F(b) - F(a).$

The respective theoretical frequencies are these probabilities multiplied by N,
the number of cases.

Fig. 1. The $\alpha_3^2$, $\delta$ chart for the Pearson system of frequency curves and the area covered
by $F(x) = 1 - (1 + x^{1/c})^{-1/k}$ (subscript L = bell-shaped)
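
The graduation steps just described can be sketched in a few lines. The code below is illustrative only: it assumes the cumulative function has the form $1 - (1 + x^{1/c})^{-1/k}$ for positive x and is zero otherwise, and that class limits are translated by matching standardized values. The parameter values in the usage lines are placeholders, not entries from the writer's tables; in practice $\nu_1$ and $\sigma$ would be read from those tables for the chosen c and k.

```python
def burr_cdf(x, c, k):
    """Assumed form of the cumulative function; taken as 0 for x <= 0."""
    if x <= 0.0:
        return 0.0
    return 1.0 - (1.0 + x ** (1.0 / c)) ** (-1.0 / k)


def theoretical_frequencies(class_limits, M, S, nu1, sigma, c, k, N):
    """Translate class limits X to x, evaluate N*F(x), and difference."""
    cumulative = []
    for X in class_limits:
        x = nu1 + sigma * (X - M) / S  # from (x - nu1)/sigma = (X - M)/S
        cumulative.append(N * burr_cdf(x, c, k))
    return [upper - lower for lower, upper in zip(cumulative, cumulative[1:])]


# Placeholder values for illustration only (not taken from the tables).
limits = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
freqs = theoretical_frequencies(limits, M=2.5, S=1.0, nu1=0.6, sigma=0.25,
                                c=0.5, k=0.5, N=200)
print([round(f, 2) for f in freqs])
```

Each returned number is the theoretical frequency for one class interval; their sum equals N times the probability that F assigns to the whole range of class limits.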

The headings that proved satisfactory for the columns of the graduation
work-sheet are: class intervals (in observed physical units), X (if a unit class-interval
is used), $f_{\text{obs}}$, x, $1 + x^{1/c}$, $N/(1 + x^{1/c})^{1/k}$, and the theoretical frequency.

The relation of F(x) to the Pearson system of frequency curves is presented in
Figure 1, which is a reproduction of a major part of Craig's chart for $\alpha_3^2$ and
$\delta$ [2]. In this chart the parameters of the twelve Pearson curves are expressed in
terms of $\alpha_3^2$ and $\delta$, where $\delta = (2\alpha_4 - 3\alpha_3^2 - 6)/(\alpha_4 + 3)$. Values of $\alpha_3^2$ and $\delta$
were computed for $F(x) = 1 - (1 + x^{1/c})^{-1/k}$ in which c and k were assigned
the values listed in the $\alpha_3$, $\alpha_4$ table. The dotted area superimposed on the Craig
chart is that covered by these $\alpha_3^2$, $\delta$ values for F(x). Although it is small in size
compared to the total area, it contains a part of the areas representing the three
main Pearson curves, I, IV, and VI, as well as the point for the normal curve
and part of the line on which lie the points corresponding to the bell-shaped
curves of the Type III functions. It also includes transitional Types V and VII.
Thus the function F(x) covers part of an important area on the $\alpha_3^2$, $\delta$ chart for
the Pearson curves.

The function F(x) was used to graduate satisfactorily several observed distributions
classified as Pearson types, including the three main Types, I, IV, and
VI, and transitional Types III and VII.

One advantage in the use of this cumulative function F(x) is that it takes but
one symbolic form within the area covered, whereas the Pearson-system curves
require several different expressions of various complexity requiring identification
of type. Furthermore, graduation by a Pearson function generally involves
approximate integration or heavy interpolation in the incomplete beta function
tables for the evaluation of the integrals of the Pearson functions, whereas
graduation by a function F(x) is easily and quickly performed since F(x) only
involves two number-parameters readily determined by means of the $\alpha_3$, $\alpha_4$
table and straight arithmetic.


The writer is deeply indebted to Professor Irving W. Burr of Purdue Uni¬ 
versity for valuable suggestions in this study. 

REFERENCES

[1] I. W. Burr, "Cumulative frequency functions," Annals of Math. Stat., Vol. 13 (1942), pp. 215-232.

[2] C. C. Craig, "A new exposition and chart for the Pearson system of frequency curves," Annals of Math. Stat., Vol. 7 (1936), pp. 16-28.



ABSTRACTS OF PAPERS 


(Presented at the Berkeley Meeting of the Institute, June 16-18, 1949) 

1. Extension of a Theorem of Blackwell. E. W. Barankin, University of California,
Berkeley.

It is proved that Blackwell's method of uniformly improving the variance of an unbiased
estimate by taking the conditional expectation with respect to a sufficient statistic,
is, in fact, similarly effective on every absolute central moment of order $s \ge 1$. The method
leads to finer detail concerning the relationship between an estimate and its thus derived
one. (This paper was prepared with the partial support of the Office of Naval Research.)

2. On the Existence of Consistent Tests. Agnes Berger, Columbia University,
New York.

Let $\mathfrak{M}(\mathfrak{B})$ denote the space of all probability-measures defined over a common Borel-field
$\mathfrak{B}$. Let $\{m\} = M$, $\{m'\} = M'$ be two disjoint subsets of $\mathfrak{M}(\mathfrak{B})$ and let $H_0$ ($H_1$) be the
hypothesis stating that the unknown distribution is in M (M'). In Neyman's terminology
$H_0$ can be consistently tested against $H_1$ if to any preassigned $\epsilon > 0$ there exists an integer
n and a critical region in the product-space of n independent observations such that the
probabilities of the errors of the first and second kind corresponding to this region are
simultaneously smaller than $\epsilon$. A sufficient condition which for a certain type of consistent
test is also necessary is established. The condition is satisfied whenever the disjoint sets
M and M' are closed and compact with respect to a certain suitable topology introduced
on $\mathfrak{M}(\mathfrak{B})$. Thus for instance $H_0$ can be consistently tested against $H_1$ if M and M' contain
only a finite number of measures or if the measures in M resp. M' depend continuously on
a parameter ranging over a closed and bounded subset of some Euclidean space.

3. Effect of Linear Truncation in a Multinormal Population. Z. William Birnbaum,
University of Washington, Seattle.

Let $(X, Y_1, Y_2, \cdots, Y_{n-1})$ have a non-singular n-dimensional normal probability
density $f(X, Y_1, Y_2, \cdots, Y_{n-1})$ for which all parameters are given, and let
$p(X, Y_1, Y_2, \cdots, Y_{n-1})$ be the probability density obtained from f by truncation along a given
hyperplane: $p = Cf$ for $c_1 Y_1 + \cdots + c_{n-1} Y_{n-1} \ge aX$, $p = 0$ elsewhere. What is the marginal
distribution of X for this truncated distribution? This question can be answered by using
a set of tables with only two parameters. These tables make it also possible to solve problems
such as: determine the plane of truncation so that the marginal distribution of X has
certain required properties. (This paper was prepared under the sponsorship of the Office
of Naval Research.)

4. Statistical Problems in the Theory of Counters. (Preliminary Report). Colin
R. Blyth, University of California, Berkeley.

The assumptions made about counter action and distribution of incident particles are
the same as those of B. V. Gnedenko [On the theory of Geiger-Müller counters, Journ. Exper.
i Teor. Phiz., Vol. 11 (1941)]. The distribution of the number X of particles registered
during a given time (0, t) is found explicitly, in terms of the density $\alpha(v)$ of incident particles
at time v. The problem considered is that of estimating the parameters of $\alpha(v)$. For
the special case $\alpha(v) = \alpha$, the distribution of X reduces to

$P(X = x) = \alpha^x (t - x\tau)^x \exp[-\alpha(t - x\tau)]/x! + \exp(-\alpha[t - x\tau]) \sum_{i=0}^{x-1} \alpha^i [t - x\tau]^i/i! - \exp(-\alpha[t - (x - 1)\tau]) \sum_{i=0}^{x-1} \alpha^i [t - (x - 1)\tau]^i/i!$ for $x = 1, 2, \cdots, s$;

$P(X = 0) = e^{-\alpha t}$; $\quad P(X = s + 1) = 1 - \exp(-\alpha(t - s\tau)) \sum_{i=0}^{s} \alpha^i (t - s\tau)^i/i!$; $\quad P(X > s + 1) = 0.$

This distribution has been found in another
problem by J. Neyman [On the problem of estimating the number of schools of fish, submitted
to Statistical Series, Univ. of Calif. Press]. For this special case the maximum likelihood
estimate $\hat\alpha$ of $\alpha$ is found to be given by $\hat\alpha\tau \exp(\hat\alpha\tau) = \{1 + \tau/(t - x\tau)\}^x\, x\tau/(t - x\tau)$. If
$\tau/(t - x\tau)$ is small, as will usually be the case, $\hat\alpha$ will be close to the estimate $x/(t - x\tau)$
usually used for $\alpha$.
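
A numerical sketch of the constant-density case follows. It is an illustration under explicit assumptions that are not stated in the abstract: the counter is free at time 0, each registration blocks it for a fixed time $\tau$, and incident particles arrive as a Poisson process of constant density $\alpha$. Under those assumptions the x-th registration occurs by time t exactly when the sum of x exponential waiting times is at most $t - (x - 1)\tau$. The function names and the numerical inputs are hypothetical.

```python
from math import exp


def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson variable N with mean mu."""
    term = exp(-mu)
    total = term
    for i in range(1, n + 1):
        term *= mu / i
        total += term
    return total


def prob_at_least(x, alpha, t, tau):
    """P(X >= x): the x-th registration happens within (0, t)."""
    if x <= 0:
        return 1.0
    free_time = t - (x - 1) * tau  # time left after x - 1 blocked periods
    if free_time <= 0.0:
        return 0.0
    # A sum of x exponential waits is <= free_time iff a Poisson count is >= x.
    return 1.0 - poisson_cdf(x - 1, alpha * free_time)


def counter_pmf(x, alpha, t, tau):
    """P(X = x) as a difference of the tail probabilities."""
    return prob_at_least(x, alpha, t, tau) - prob_at_least(x + 1, alpha, t, tau)


# Hypothetical inputs: density 0.8, observation time 10, blocked time 1.5.
probs = [counter_pmf(x, 0.8, 10.0, 1.5) for x in range(0, 9)]
print([round(p, 4) for p in probs], round(sum(probs), 6))
```

With these inputs at most seven particles can be registered, so the probabilities for x = 0, ..., 7 sum to one and the probability for x = 8 is zero.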

5. Some Two-Sample Tests. Douglas G. Chapman, University of California,
Berkeley.

Let X, Y be random variables normally distributed with means $\xi$, $\eta$, variances $\sigma_1^2$, $\sigma_2^2$
respectively. The two sample procedure formulated by Stein to obtain a test with power
independent of $\sigma$, for the hypothesis $\xi = \xi_0$ is used here to determine a test for the hypothesis
$\xi/\eta = r$ (r any pre-assigned real number). The size and power of this test are independent of
$\sigma_1$ and $\sigma_2$. The two sample procedure may be extended to the more general case of testing
the hypothesis of equality of means of several normal populations, the variances being
unknown. Approximate tests are obtained for this case. Finally it is shown that this two
sample procedure can be used to select that normal population, of several, with the greatest
mean, the rule of selection having a preassigned level of accuracy. (This paper was prepared
with the partial support of the Office of Naval Research.)

6. Minimum Variance in Non-Regular Estimation. R. C. Davis, U. S. Naval
Ordnance Test Station, Inyokern.

The Cramér-Rao inequality for the minimum variance of a regular estimate of an unknown
parameter of a probability distribution is extended to a broad class of non-regular
types of estimation. The theory is developed only for the case in which a probability density
function and a sufficient statistic for the unknown parameter exist. For every non-regular
estimation problem included in the above class, it is proved that there exists a
unique unbiased estimate which attains minimum variance, and a method is given for
obtaining the sample estimate. Examples are given, such as the rectangular distribution,
a class of truncated distributions, etc.

7. Auxiliary Random Variables. Mark W. Eudey, California Municipal Statistics,
Inc., San Francisco.

In testing hypotheses concerning discontinuous random variables it is not possible to
find regions of arbitrary size, and so if we compare two critical regions, selection between
them on the basis of the usual criteria of the Neyman-Pearson theory of testing hypotheses
may be confused by the difference in their sizes. This difficulty may be avoided by allowing
the statistician to use a mixed strategy in such cases, and make his decision to accept or
reject the hypothesis depend upon an independent auxiliary random variable. For example,
if K is a binomial variable, and U has a uniform distribution (0, 1), then Z = K + U may
be used to test hypotheses concerning the binomial parameter, and regions of any size may
be found. For the binomial case this procedure leads to a class of uniformly most powerful
tests for one-sided alternatives, and to uniformly most powerful unbiased tests for two-sided
alternatives. Similar results are obtained for other common discontinuous variables,
and the same device may be used in considering confidence regions and decision functions
for such variables. (This paper was prepared with the partial support of the Office of
Naval Research.)
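
The device Z = K + U can be illustrated with a short computation. The sketch below is an illustration, not the author's construction: for a one-sided test that rejects when Z exceeds a critical value, it finds the critical value giving an exact preassigned size under a hypothesized binomial parameter. The function names and the inputs n = 10, $p_0$ = .5, size .05 are hypothetical.

```python
from math import comb


def binom_pmf(j, n, p):
    """P(K = j) for a binomial variable K with n trials and parameter p."""
    return comb(n, j) * p ** j * (1.0 - p) ** (n - j)


def critical_value(n, p0, alpha):
    """Return c with P(K + U > c) = alpha when K is binomial(n, p0)."""
    upper_tail = 0.0  # running value of P(K > j)
    for j in range(n, -1, -1):
        pj = binom_pmf(j, n, p0)
        if upper_tail + pj > alpha:
            # Reject for K > j, and for K = j when U exceeds the fraction below.
            frac = 1.0 - (alpha - upper_tail) / pj
            return j + frac
        upper_tail += pj
    return 0.0


def rejection_probability(n, p, c):
    """P(K + U > c) when K is binomial(n, p) and U is uniform on (0, 1)."""
    j = int(c)
    if j > n:
        return 0.0
    frac = c - j
    tail = sum(binom_pmf(i, n, p) for i in range(j + 1, n + 1))
    return tail + (1.0 - frac) * binom_pmf(j, n, p)


# Hypothetical inputs for illustration.
c = critical_value(10, 0.5, 0.05)
print(round(c, 4), round(rejection_probability(10, 0.5, c), 6))
```

Because U is continuous, the size equals the preassigned level exactly, which is not possible with a region defined on K alone; the same function evaluated at other values of p gives the power of the test.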


8. Estimation in Truncated Samples. Max Halperin, The Rand Corporation,
Santa Monica, California.

A death process is considered which starts with n individuals of zero age, each following
the mortality law, $f(x, \theta)$. That is,

$F(t) = \Pr(\text{Age at death} \le t) = \int_0^t f(x, \theta)\, dx,$

where $f(x, \theta)$ is a probability density. We suppose we truncate the process at a fixed time,
T, and wish to estimate $\theta$ when

a) individuals who die are not replaced, and

b) individuals who die are replaced by individuals of zero age following the mortality
law, $f(x, \theta)$.

In both cases, it is found that, under mild conditions, estimation by Maximum Likelihood
gives optimum estimates. The estimates are best in the sense of being asymptotically
normally distributed and of minimum variance for large samples.

The proofs are given for the case of a single parameter, but can be extended to the multi-parameter
case. Examples are given.


9. Some Problems in Point Estimation. J. L. Hodges, Jr. and E. L. Lehmann,
University of California, Berkeley.

Some point estimation problems are considered in the light of Wald's general theory. It
is shown that when the loss function is convex, one may restrict consideration to nonrandomized
estimates based on sufficient statistics. Minimax estimates are obtained in a
number of cases connected with the binomial and hypergeometric distributions, and with
some non-parametric problems. Some prediction problems are also considered. (This paper
was prepared with the partial support of the Office of Naval Research.)


10. Completeness in the Sequential Case. E. L. Lehmann and C. Stein, University
of California, Berkeley.

Recently, in a series of papers, Girshick, Mosteller, Savage and Wolfowitz have considered
the uniqueness of unbiased estimates depending only on an appropriate sufficient
statistic for sequential sampling schemes of binomial variables. A complete solution was
obtained under the restriction to bounded estimates. This work, which has immediate
consequences with respect to the existence of unbiased estimates with uniformly minimum
variance, is extended here in two directions. A general necessary condition for uniqueness
is found, and this is applied to obtain a complete solution of the uniqueness problem when
the random variables have a Poisson or rectangular distribution. Necessary and sufficient
conditions are also found in the binomial case without the restriction to bounded estimates.
This permits the statement of a somewhat stronger optimum property for the estimates,
and is applicable to the estimation of unbounded functions of the unknown probability.

11. The Ratio of Ranges. Richard F. Link, University of Oregon, Eugene.

The distribution of the ratio of two ranges from independent samples drawn from a
normal population is given analytically for $n_1$ and $n_2 \le 3$. A table of percentage values, R,
is given for $\alpha$ = .005, .01, .025, .05, .10 and for all combinations of $n_1$ and $n_2$ up to 10, where
$\alpha = \Pr(w_1/w_2 > R)$ and $w_1$ and $w_2$ are the observed ranges. (This paper was prepared under
the sponsorship of the Office of Naval Research.)


12. Some Problems Arising in Plant Selection and the Use of Analysis of
Variance. Stanley W. Nash, University of California, Berkeley.

The yields of many (m) varieties are compared in a field trial. A few varieties having the
highest and lowest yields in this trial are selected for further testing. What chance is there
that the first trial will give a significant result, the second trial not? Let $\xi_\nu$ denote the true
mean yield of the $\nu$th variety, and assume that the $\xi_\nu$ are themselves normally, independently
distributed with variance $\sigma_\xi^2$. Let $P_k$ (k = 1, 2) denote the probability of a significant result
in the kth trial, using the F-test. For fixed $\sigma_\xi^2 > 0$, $\lim_{m \to \infty} P_1 = 1$. (See Nash, Annals of
Math. Stat., Vol. 19 (1948), p. 434.) Now let $\sigma_\xi^2 > 0$ take on a decreasing sequence of values
as m increases Tf ■ , = 0 (- ), then lim„_„ Pi » 1. Here 1 + fflff(m) = 

aiSW \ / 

^(numerator of F) lim»-,„ Pi < 1 if and only if ol =■ . For <rj 


(= error variance) 


= 0 ( = — ) ,limm_„Pj = a, the level of significance used. Thus, corresponding to any 

VVlogm/ ^ 

m, however large, one can find values of $\sigma_\xi^2$ for which the chances are considerable (or even
approaching $1 - \alpha$), that the two field trials will give opposite conclusions when the F-test
is used.


13. Asymptotic Properties of the Wald-Wolfowitz Test of Randomness. Gottfried
E. Noether, Columbia University, New York.

Let $a_1, \cdots, a_n$ be observations on the chance variables $X_1, \cdots, X_n$. Wald and Wolfowitz
(Annals of Math. Stat., Vol. 14 (1943), pp. 378-388) have shown how the statistic
$R_h = \sum_{i=1}^{n} x_i x_{i+h}$, $(x_{n+j} = x_j)$, can be used to test the null hypothesis that the $X_i$, $(i = 1, \cdots, n)$,
are independently and identically distributed by considering the distribution of $R_h$ in
the subpopulation of all permutations of the $a_i$. In the present paper it is shown that when
the null hypothesis is true this distribution of $R_h$ is asymptotically normal provided
$\sum_{i=1}^{n} (a_i - \bar a)^r / \left[\sum_{i=1}^{n} (a_i - \bar a)^2\right]^{r/2} = o\left[n^{-(r-2)/4}\right]$, $(r = 3, 4, \cdots)$, a condition which is satisfied
with probability 1 if the $a_i$ are independent observations on the same chance variable X
having positive variance and a finite absolute moment of order $4 + \delta$, $(\delta > 0)$. Conditions
are given for the consistency of the test based on $R_h$ when under the alternative hypothesis
observations are drawn independently from changing populations. In particular a downward
trend and a regular cyclical movement are considered, both for ranks and original
observations. For the special case of a regular cyclical movement of known length the
asymptotic relative efficiency of the rank test with respect to the test performed on original
observations is found. It is shown that when using ranks, $R_h$ is asymptotically normal
under the alternative hypothesis provided $\liminf_{n \to \infty} \operatorname{var}(n^{-5/2} R_h) > 0$. This asymptotic
normality of $R_h$ is used to compare the asymptotic power of the $R_h$-test with that of the
Mann T-test (Econometrica, Vol. 13 (1945), pp. 245-259) for the case of a downward trend.

14. On the Similar Regions of a Class of Distributions. Stefan Peters, University
of California, Berkeley.

The class of distributions considered is essentially the class of those distributions of n
variables which, by a suitable transformation of the variables and the parameter, can be
transformed into distributions defined in the whole $R_n$ for which the parameter is a location
parameter. Those regions satisfy a certain partial differential equation. The transformed
distributions of the variables $y_1, \cdots, y_n$ and parameter $\tau$ possess a class $D_1$ of similar
regions with respect to $\tau$ which can be defined as the smallest additive class of regions
which includes all regions defined by

$\varphi[(y_1 - y_n), \cdots, (y_{n-1} - y_n)] \ge 0$

where $\varphi$ is a continuous function. The class $D_1$ does not exhaust all similar regions. There
exists among the regions of class $D_1$ one which is most powerful for testing a given additional
parameter $\sigma$. If there exists among all similar regions a most powerful region for
testing $\sigma$, then that region will be the most powerful region of class $D_1$.

15. Some Problems in Sequential Analysis. Charles M. Stein, University of
California, Berkeley.

Wald's fundamental identity for cumulative sums is extended to dependent random
variables. The first derivative of this at the origin is equivalent to a result of Wolfowitz
(Annals of Math. Stat., Vol. 18 (1947), p. 228, Th. 7.4). Higher derivatives of this at the
origin can also be obtained from linear combinations of Wolfowitz's result applied to suitable
products of the original random variables. These equations yield approximate OC and ASN
curves for probability-ratio tests for a simple hypothesis against a single alternative concerning
some of the more usual stationary Markoff chains. Bounds for the amount by
which the ASN exceeds that of the most efficient test are also obtained. The results are
applied in particular to random variables taking on only the values 0, 1 with conditional
probabilities depending only on a finite number of the preceding observations. The case
of linear dependence of normal random variables with fixed conditional variance is also
considered.

16. Some Aspects of Links Between Prediction Problems and Problems of
Statistical Estimation. Erling Sverdrup, University of Oslo.

A prediction is not taken as a probability statement about additional observations of
the random variable already observed. It is presumed that the statistical interpretation
of the sample will result in some action influencing the random variable subject to prediction.
The probability distribution of this random variable is given for each of an a priori
class of probability functions for the observed random variable and for each of a class of
possible actions. "Utility" as a function of the random variable to be predicted and of the
action is defined. It is shown that the problem of which action to take in order to maximize
expected utility is identical with a problem of statistical inference with a uniquely defined
weight function in the Wald sense. It is further shown that this procedure is adaptable to
stochastic processes of a general type and this provides a means of connecting the theory
of stochastic processes with the theory of statistical inference. Some examples are given to
illustrate the general theory.

17. Some Large Sample Tests for the Median. John E. Walsh, The Rand
Corporation, Santa Monica, California. 

Consider a large number of independent observations from continuous populations with
a common median. Some non-parametric large sample tests for the population median are
presented which are based on either two or three order statistics of the sample. If all the
populations are symmetrical, these tests are equal-tailed with specified significance level α.
If the observations are a sample from a normal population, these tests have high power
efficiencies. Some tests based on three order statistics are developed which also have
significance level α if all the populations are not symmetrical; however, in this case the resulting
test is one-tailed instead of equal-tailed. Using these tests for situations where the
populations are believed to be symmetrical furnishes a safety factor with respect to Type I error.
Tests are presented for the special case where each population is either symmetrical or
skewed in a specified direction. If the populations are not symmetrical the significance
level distribution is .4α to one tail and .6α to the other, rather than .5α to each tail. Also
some non-parametric large sample tests of whether a sample is from a symmetrical
population are derived. These tests are based on three order statistics of the sample and have
bounded significance levels.
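
The tests of this abstract use two or three order statistics and are not reproduced here. As a simpler point of comparison, the sketch below computes the exact level of the classical one-sided test that rejects when the k-th smallest observation exceeds the hypothesized median m0. It assumes only independence, continuity, and a common median m0, so that the number of observations at or below m0 is Binomial(n, 1/2); the function names are hypothetical.

```python
from math import comb

def order_statistic_test_level(n, k):
    """P(X_(k) > m0) when m0 is the common median of n continuous populations.

    X_(k) > m0 exactly when fewer than k observations fall at or below m0,
    and that count is Binomial(n, 1/2) under the null hypothesis.
    """
    return sum(comb(n, j) for j in range(k)) / 2 ** n

def largest_k_with_level(n, alpha):
    """Largest k for which rejecting when X_(k) > m0 has level at most alpha (0 if none)."""
    k = 0
    while k < n and order_statistic_test_level(n, k + 1) <= alpha:
        k += 1
    return k

print(order_statistic_test_level(10, 2))   # 11/1024
print(largest_k_with_level(10, 0.05))      # 2
```

The attainable levels of such a single-order-statistic test form a coarse discrete set, which is one motivation for tests that combine several order statistics.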

18. Continuous Sampling Plans from the Risk Point of View. Zivia S. Wurtele,
Stanford University, California. 

The quality of a lot can be improved by a screening process whereby the defective items
found during inspection are replaced by non-defective items. The type of sampling plan
adopted will generally depend upon the cost of inspecting items, the number of defective
items in the lot prior to inspection, and the loss due to defective items remaining in the lot
after inspection. The loss if the lot is accepted after d defectives are found in a sample of
n items is equal to c(n) + h(D), where D is the number of defectives left in the lot and c(n)
is the cost of inspecting n items. An inspection procedure S is defined by a set of stopping
points {(d, n)}. Let r(p, S) be the expected loss if p is the probability of a defective and
the procedure S is used. It is assumed that the lot is obtained from a binomial population.
For any a priori distribution F(p), a Bayes procedure is one which minimizes the expected
risk,

$$\int_0^1 r(p, S)\, dF(p).$$

A systematic method of obtaining Bayes solutions exists, but the computations are
formidable. Under fairly general conditions the Bayes solutions are shown to be multiple sampling
plans, in which the size of the ith sample depends upon the number of defectives in the
(i - 1)st sample. In particular, if the production is in a state of statistical control, a
Bayes solution is a fixed sample size. It is also shown that for most reasonable loss
functions, there exists no mini-max procedure which is uniformly better than the trivial one,
namely, the Bayes procedure if p = 1.





NEWS AND NOTICES 

Readers are invited to submit to the Secretary of the Institute news items of interest.

Personal Items 

Dr. Irving Burr has been promoted to a full professorship at Purdue University. 

Dr. D. A. S. Fraser, who received his Ph.D. degree at Princeton University in
June, has accepted a position as Instructor of Mathematics at the University 
of Toronto. 

Dr. H. K. Hartline, formerly at the Johnson Research Foundation of the University
of Pennsylvania, has accepted an appointment as chairman of the Thomas
Jenkins Department of Biophysics, Johns Hopkins University.

Dr. Leo Katz has been promoted to an associate professorship in the Mathematics
Department of Michigan State College, East Lansing, Michigan.

Professor D. D. Kosambi of Tata Institute for Fundamental Research, Bombay,
India, served as Visiting Professor at the University of Chicago for the
Winter Quarter.

Dr. H. G. Landau has resigned his position with the Ballistic Research Laboratories
and is now a Research Associate with the Committee on Mathematical
Biology, at the University of Chicago.

Mr. Allen L. Mayerson, formerly an Associate in the Division of Statistics
and Research of the Institute of Life Insurance at New York City, has accepted 
a position with the National Surety Corporation of New York. 

Mr. Raymond P. Peterson, who has been an Assistant in the Mathematics 
Department of the University of California at Los Angeles and also a graduate 
student there, has accepted a position with the Institute for Numerical Analysis 
at Los Angeles.

Professor Edwin J. G. Pitman has returned to the Mathematics Department 
of the University of Tasmania after spending about a year and a half in the 
United States. From February to June of 1948 he was at Columbia University 
as visiting Professor of Mathematical Statistics. The rest of the time was spent
at North Carolina and Princeton.

A. Ananthapadmanabha Rau has returned to India after studying at the
Statistical Laboratory in Ames, Iowa. In addition to heading the Department
of Statistics and Agriculture Meteorology of the Government of the State of
Mysore, India, he is working on sampling design of experiments, and climatology
and teaching statistics and climatology at the College of Agriculture.

Dr. Andrew Sobczyk of Watson Laboratories has been appointed to an 
assistant professorship at Boston University. 

Assistant Professor S. L. Thompson of Alabama Polytechnic Institute has 
been promoted to an associate professorship. 

William J. Youden is acting as Assistant Chief of the Statistical Engineering 
Section of the National Bureau of Standards and as special advisor to the Director
on the problems of statistical and mathematical design of major experiments
in physics, chemistry and engineering. 




Two Doctorates in Mathematical Statistics were awarded at the University of 
North Carolina in June, 1949. The recipients were Uttam Chand, who has now 
been appointed Assistant Professor of Mathematics at Boston University, and 
Ralph A. Bradley, who will be Assistant Professor of Mathematics at McGill
University. 


The Educational Testing Service, Princeton, N. J., announces the appointment 
of Elbert Lee Hoffman and William Edward Kline as ETS Psychometric Fellows 
for 1949-50 for graduate study in psychology at Princeton University. Mr.
Hoffman is a graduate of the University of Oklahoma, and Mr. Kline has received 
both his bachelor’s and master’s degree from Yale University. Bert F. Green, Jr. 
and Warren S. Torgerson have received reappointments as ETS Psychometric 
Fellows. Each Fellow carries a full program of graduate study in psychology at 
Princeton University, including basic work in experimental and theoretical 
psychology. Special training is also given in mathematical statistics and modern
quantitative methods as applied to psychological problems in such fields as 
learning, testing and attitude measurement, as well as in the techniques of 
developing aptitude and achievement tests. In addition to the graduate program 
in psychology, each Fellow spends part-time in training and research work with
the Educational Testing Service. 


Preliminary Actuarial Examinations 
Prize Awards 

The winners of the prize awards offered by the Society of Actuaries to the 
nine undergraduates ranking highest on the score of Part 2 of the 1949 Preliminary
Actuarial Examinations are as follows:

First Prize of $200 

Moran, Joseph W. .... Yale University


Additional Prizes of $100

Farmer, Thurston P., Jr. .... State University of Iowa
Haakenstad, Dale L. .... University of Michigan
Hauke, William V. .... University of Michigan
Lordan, Joseph D. .... Massachusetts Institute of Technology
Mayberry, John P. .... University of Toronto
Murch, Alan D. .... University of Toronto
White, William A. .... Dartmouth College
Zemach, Ariel .... Harvard University


The Society of Actuaries has authorized a similar set of nine prize awards 
for the 1950 Examinations on Part 2. 

The Preliminary Actuarial Examinations consist of the following three
examinations: 

Part 1. Language Aptitude Examination.

(Reading comprehension, meaning of words and word relationships, antonyms,
and verbal reasoning.)

Part 2. General Mathematics Examination.

(Algebra, trigonometry, coordinate geometry, differential and integral calculus.)

Part 3. Special Mathematics Examination.

(Finite differences, probability and statistics.)


The 1950 Preliminary Actuarial Examinations will be administered by the
Educational Testing Service at centers throughout the United States and
Canada on May 19, 1950. The closing date for applications is March 15, 1950.
Detailed information concerning the Examinations can be obtained from:

The Society of Actuaries 
208 South LaSalle Street
Chicago 4, Illinois 


New Members 

The following persons have been elected to membership in the Institute
(March 1, 1949 to May 31, 1949) 

Alcantara de Oliveira, Eduardo, Ph.D. (Univ. de Sao Paulo) Professor, Faculdade de
Filosofia, University of Sao Paulo, Rua Sergipe, BO-Ap, SS, Sao Paulo, Brazil.

Ashby, Wallace L., A.B. (George Washington Univ.) Agricultural Statistician, Jocelyn
Street, Washington IS, D. C.

Bailey, Edward W., B.Ch. (Ohio State Univ.) Quality Control Supervisor, Carbide and
Carbon Chemicals Corporation, Y-12 Plant, 101 Moylan Lane, Oak Ridge, Tennessee.

Berger, Agnes P., Ph.D. (Budapest) 10 Park Avenue, New York, New York.

Brown, Walter C., B.S. (Colorado A&M College) Graduate Assistant, Department of
Mathematics, University of Oklahoma, IISO Trout, Norman, Oklahoma.

Calvin, Lyle D., B.S. (Univ. of Chicago) Research Graduate Assistant, Institute of Statistics,
North Carolina State College, Raleigh, North Carolina.

Carlyle, Charles G., B.S. (Univ. of Illinois) Graduate student at University of Illinois,
C-SS Stadium Terrace, Champaign, Illinois.

Chen, Yu-nien, M.A. (Harvard) Graduate student, Harvard University, UfI-60, Apt. D,
Charter Road, Jamaica S, New York.

Clark, Fred J., Jr., B.S. (Colorado A&M College) Graduate Assistant at University of
Illinois, Department of Mathematics, (II A Cowl 0, Stadium Terrace, Champaign,
Illinois.

Cohen, Samuel E., M.A. (Univ. of Pennsylvania) Statistician, U. S. Bureau of Labor
Statistics, 49 Galveston St., S.W., Washington SO, D. C.

Cole, Randal H., Ph.D. (Univ. of Wisconsin) Associate Professor, University of Western
Ontario, London, Canada.

Comrey, Andrew L., Ph.D. (Univ. of Southern Calif.) Assistant Professor of Psychology,
University of Illinois, Urbana, Illinois.

Cook, Ellsworth B., B.S. (Springfield College) Head of Visual Screening Devices Research
and Statistics Facility, U. S. Naval Medical Research Laboratory, Box IfB, Submarine
Base, New London, Connecticut.

Cox, David R., Ph.D. (Leeds, England) Statistician, Wool Industries Research Association,
3 Sunset Avenue, Leeds 8, Yorks, England.

Denbow, Carl H., Ph.D. (Univ. of Chicago) Associate Professor of Mathematics, U. S.
Naval Postgraduate School, Annapolis, Maryland.

Dillon, Gregory M., A.B. (Long Island Univ.) Statistician, Pension Statistics Section,
Treasury Department, E. I. DuPont de Nemours & Co., 1331 Cedar Street, Wilmington,
Delaware.

Duarte, Geraldo Garcia, Licenciado em Matematica (Faculdade de Filosofia de S. Bento)
Assistente da Faculdade de Higiene e Saude Publica, Caixa Postal 99B, Sao Paulo,
Brazil.

Dudman, John A., B.A. (Reed College) Graduate student, Columbia University, 66 West
70th St., New York 23, New York.

Edelson, Howard, B.A. (Ohio State Univ., Columbus, Ohio) Graduate student and Graduate
Assistant, Ohio State University, 794 S. 18th St., Columbus 6, Ohio.

Feron, R., Licencié ès Sciences (Univ. of Paris) Attaché de Recherche, 13 rue des
Feuillantines, Paris V, France.

Franck, Edward Michel, Ingénieur A.I.A., Professor of the Royal Military School, 104 Rue
Pere Devroye, Woluwe St. Pierre, Belgium.

Garritsen, Florence M., B.A. (Univ. of Michigan) Research Assistant, General Motors
Corp., 6161 Lillibridge Ave., Detroit IS, Michigan.

Gelsoznml, Thea, Ph.D. (Univ. of Bocconi, Milano) Assistant of Statistics at Department
of Statistics, University of Bocconi, Via A. Stoppi, N. 10, Milano, Italy.

Goudswaard, G., Ph.D. (Univ. of Leiden) Director, Permanent Office, International
Statistical Institute and Lecturer of Statistics, Rotterdam School of Economics and Free
University of Amsterdam, 2 Oostduinlaan, The Hague, Netherlands.

Gucker, Frank Fulton, A.B. (Harvard Univ.) Statistical Engineer, Remington Arms Co.,
Inc., 3176 Main Street, Bridgeport 8, Connecticut.

Haberman, Sol, B.A. (Brooklyn College) Assistant Visiting Professor of Sociology,
University of Puerto Rico, 187 Avenida los Flamboyanes, Rio Piedras, Puerto Rico.

Heimbach, Ernest E., M.B.A. (New York Univ.) Professor of Economics, Bergen College,
Teaneck, New Jersey, 66 West 11th Street, New York 11, N. Y.

Ishii, Shigeru, B.A. (Univ. of Ill.) Student at University of Illinois, 320-1 Peabody Drive,
Parade Ground Units, Champaign, Illinois.

Jackson, James Edward, M.A. (Univ. of N. C.) Statistician, Color Control Dept., Eastman
Kodak Company, 200 Pershing Drive, Rochester, New York.

Jaspen, Nathan, Ph.D. (Pennsylvania State College) Research Assistant, Department of
Psychology, Pennsylvania State College, State College, Pennsylvania.

Jonhagen, Sven, Fil. Lic. (Univ. of Stockholm) Chief Actuary and Assistant Teacher in
Statistics at the University of Stockholm, Tegnergatan 36, Stockholm, Sweden.

Kiefer, Jack C., M.S. (Mass. Inst. of Tech.) Student, Department of Mathematical
Statistics, Columbia University, 3826 Middleton Avenue, Cincinnati 20, Ohio.

Kraft, Charles Hall, B.A. (Mich. State College) Instructor, Mathematics Department,
Michigan State College, 707D Chestnut Road, East Lansing, Mich.

Mayne, John W., M.Sc. (Brown Univ.) Graduate student in Mathematical Statistics, 330
Furnald Hall, Columbia University, New York 27, New York.

McCabe, William J., M.A. (George Washington Univ.) Chief Statistician, Transportation
Corps, Department of the Army, 1726 South Oakland Street, Arlington, Virginia.

Medin, Knut H., M.A. (Univ. of Uppsala) Assistant, Statistical Institute, University of
Uppsala, Odinslund 2, Uppsala, Sweden.

Mewborn, A. Boyd, Ph.D. (Calif. Inst. of Tech.) Associate Professor of Mathematics and
Mechanics, P.O. Box 1748, Monterey, California.

Minton, Paul D., M.S. (Southern Methodist Univ.) Graduate student, University of North
Carolina, P.O. Box 634, Chapel Hill, North Carolina.

Morris, Doris N., M.A. (Columbia Teachers College) Economics Assistant, Western Electric
Co., 101 West 72nd St., New York 23, New York.

Morris, Robert H., B.A. (Swarthmore) Development Engineer, Color Control Department,
Eastman Kodak Co., Rochester, New York.





Rajalakshman, D. V., M.Sc. (Madras Univ.) Head, Department of Statistics, University
of Madras, Madras, S. India.

Rudy, Norman, M.B.A. (Univ. of Chicago) Scientist, Ordnance Research Project, University
of Chicago and Instructor in Economics, Roosevelt College, 7J0S S. Cmdon
Avenue, Chicago ,|5, Illinois.

Sakk, Kaarel, Fil. kand. (Univ. of Stockholm) Officer at Research Bureau of the State
Foodstuffs Commission, Ostermalmsgatan 67 o.g. Ill, Stockholm, Sweden.

Singh, Jagjit, B.A. (Punjab Univ.) Superintendent Transportation, E. I. Railway, Dinapore,
c/o ., Foil Offics BonUI, Calcutta, India.

Starr, Henry H., Ph.D. (Univ. of Vienna) Research Manager, Converted Rice, Inc., P.O.
Box 1762, Houston 1, Texas.

Sverdrup, Erling, Actuarian (Univ. of Oslo) Lecturer in Mathematical Statistics, Institute
of Mathematics, University of Oslo, Oslo, Norway.

Talacko, Joseph V., Ph.D. (Charles Univ., Prague) Assistant Professor of Mathematics,
Marquette University, HS03 So. Mh Street, Milwaukee 7, Wisconsin.

Templeton, James G. C., A.M. (Princeton Univ.) Graduate student at Princeton University,
Fine Hall, Princeton University, Princeton, New Jersey.

Vaughan, Elizabeth, B.S. (Univ. of Washington) Statistician, U. S. Fish and Wildlife Service,
2725 Montlake Boulevard, Seattle 2, Washington.

Wilkinson, Bryan, M.A. (Univ. of Nebraska) Personnel Research Specialist, Prudential
Insurance Co., Western Home Office, 1/50-B Muirfield Road, Los Angeles jS,
California.

Yevick, Mariam A. L., Ph.D. (Mass. Inst. of Tech.) Staff, Division of Statistical Engineering,
National Broadcasting System, 9$1 Hudson St., Hoboken, New Jersey.



REPORT ON THE BERKELEY MEETING OF THE INSTITUTE 


The thirty-ninth meeting and fifth regional West Coast meeting of the Institute
of Mathematical Statistics was held on the Berkeley campus of the
University of California, from Thursday June 16 through Saturday June 18, 
1949. The session on June 17 was held jointly with the Biometrics Section of the 
American Statistical Association and the Biometric Society (Western N. A. 
Region). Sixty-six persons registered, including the following fifty members of 
the Institute: 

Jane F. Andrian, G. A. Baker, Z. Wm. Birnbaum, Colin R. Blyth, Albert H. Bowker,
Paul T. Bruyere, Chin Long Chiang, Edwin L. Crow, John H. Curtiss, R. C. Davis, Carl
H. Denbow, W. J. Dixon, Mary Elveback, Mark Eudey, Edward A. Fay, Evelyn Fix,
William R. Gaffey, H. H. Germond, M. A. Girshick, Jack Gysbers, Max Halperin, J. L.
Hodges, Jr., John M. Howell, Harry M. Hughes, Cuthbert Hurd, Terry A. Jeeves, Mark
Kac, H. S. Konijn, George M. Kuznets, Erich L. Lehmann, Richard F. Link, Michel Loève,
Frank Massey, Lincoln E. Moses, Edith Mourier, Stanley W. Nash, J. Neyman, Edward
Paulson, Stefan Peters, Raymond P. Peterson, Robert I. Piper, Gladys Rappaport, Mina
Rees, David Rubinstein, Elizabeth L. Scott, Esther Seiden, Charles M. Stein, John E.
Walsh, John Wishart, Zivia S. Wurtele.

Those attending were welcomed at the Thursday morning session by Edward 
W. Strong, Associate Dean of the College of Letters and Science, University of 
California. Professor Z. William Birnbaum of the University of Washington 
presided. 

The program was as follows: 

1. Recent advances in the theory of the Wishart distribution. (Invited paper.) 
John Wishart, Cambridge University. 

2. Bayes, minimax, and other approaches to the multiple classification problem.
(Invited paper.) M. A. Girshick, Stanford University.

3. Some problems in sequential analysis. Charles M. Stein, University of
California, Berkeley. 

Professor Jerzy Neyman of the University of California, Berkeley, presided 
at the Thursday afternoon session. Midway in the program there was an intermission
for a tea given by the Statistical Laboratory, University of California.
The program was as follows: 

1. Completeness in the sequential case. E. L. Lehmann and C. M. Stein, University of
California, Berkeley.

2. Some large sample tests for the median. John E. Walsh, The Rand Corporation.

3. Continuous sampling plans from the risk point of view. Zivia S. Wurtele, Stanford
University.

4. Some problems in point estimation. J. L. Hodges, Jr. and E. L. Lehmann, University
of California, Berkeley.

5. Minimum variance in non-regular estimation. R. C. Davis, U. S. Naval Ordnance Test
Station, Inyokern.

6. Some aspects of links between prediction problems and problems of statistical estimation.
Erling Sverdrup, University of Oslo.

7. Extension of a theorem of Blackwell. (By title). Edward W. Barankin, University of
California, Berkeley.

8. Some two-sample tests. (By title). Douglas G. Chapman, University of California,
Berkeley.

9. On the existence of consistent tests. (By title). Agnes Berger, Columbia University.

Professor F. W. Weymouth of Stanford University presided at the Friday 
morning session on biometrics. The program was as follows:

1. Statistical problems arising from research in tuberculosis. Martha and Paul T. Bruyere,
U. S. Public Health Service.

2. Correlation of variability with growth rate in fish and mollusks. F. W. Weymouth,
Stanford University.

3. Some problems arising in plant selection and the use of analysis of variance. Stanley
W. Nash, University of California, Berkeley.

4. Studies of resistance of strawberry varieties and selections to verticillium wilt. R. E. Baker
and G. A. Baker, University of California, Davis.

5. A uniformity trial on unirrigated barley of ten years duration with implications for
field trial designs. F. J. Veihmeyer, M. R. Huberty, and G. A. Baker, University of
California, Davis and Los Angeles.

On Friday afternoon those attending the meeting were entertained at a picnic 
luncheon at Stanford University, given by the Department of Statistics, Stanford 
University. 

Professor C. B. Morrey, Jr., of the University of California, Berkeley, presided
at the Saturday morning session. The program consisted of the following invited 
papers: 

1. Methods for getting limiting distributions. Mark Kac, Cornell University.

2. Almost certain convergence. Michel Loève, University of California, Berkeley.

At 11 o’clock Saturday morning a business session was held, under the chairmanship
of Professor Jerzy Neyman of the University of California, Berkeley,
for the purpose of discussing future West Coast meetings. Plans for reviving the
Statistical Research Memoirs were also discussed.

On Saturday afternoon a final session for contributed papers was held under 
the chairmanship of Professor Albert H. Bowker of Stanford University. The
program was as follows: 

1. Effect of linear truncation in a multinormal population. Z. William Birnbaum,
University of Washington.

2. Estimation in truncated samples. Max Halperin, The Rand Corporation.

3. On the similar regions of a class of distributions. Stefan Peters, University of California,
Berkeley.

4. Auxiliary random variables. Mark W. Eudey, California Municipal Statistics.

5. The ratio of ranges. Richard F. Link, University of Oregon.

6. Statistical problems in the theory of Geiger counters. Colin R. Blyth, University of
California, Berkeley.

7. Asymptotic properties of the Wald-Wolfowitz test of randomness. (By title). Gottfried
E. Noether, Columbia University.

J. L. Hodges, Jr. 
Assistant Secretary 



LOCALLY BEST UNBIASED ESTIMATES¹

By E. W. Barankin
University of California, Berkeley 

Summary. The problem of unbiased estimation, restricted only by the postulate
of section 2, is considered here. For a chosen number s > 1, an unbiased estimate
of a function g on the parameter space, is said to be best at the parameter
point $\theta_0$ if its sth absolute central moment at $\theta_0$ is finite and not greater than that
for any other unbiased estimate. A necessary and sufficient condition is obtained
for the existence of an unbiased estimate of g. When one exists, the best one is
unique. A necessary and sufficient condition is given for the existence of only
one unbiased estimate with finite sth absolute central moment. The sth absolute
central moment at $\theta_0$ of the best unbiased estimate (if it exists) is given explicitly
in terms of only the function g and the probability densities. It is, to be more
precise, specified as the l.u.b. of a certain set of numbers. The best estimate is
then constructed (as a limit of a sequence of functions) with the use of only the
data (relating to g and the densities) associated with any particular sequence
in this set which converges to its l.u.b.

The case $s = \infty$ is considered apart. The case s = 2 is studied in greater
detail. Previous results of several authors are discussed in the light of the present
theory. Generalizations of some of these results are deduced. Some examples
are given to illustrate the applications of the theory.

1. Introduction. Let $\Omega$ be a space of points x, and $\mu$ be a totally additive
measure defined on a $\sigma$-field $\mathcal{F}$ of subsets of $\Omega$. Let $\mathfrak{P} = \{p_\theta,\ \theta \in \Theta\}$ be a family of
probability densities in $\Omega$ with respect to the measure $\mu$. $\Theta$ is any index set; we
lay down no conditions on its structure. We are concerned here with the existence
and characterization of unbiased estimates of a real-valued function g on $\Theta$,
which are in some suitable sense “best” for a prescribed parameter point $\theta_0$.
That is, a real-valued, measurable ($\mu$) function $f_0$ on $\Omega$ such that

(1) $\int_\Omega f_0\, p_\theta\, d\mu = g(\theta), \quad \theta \in \Theta,$

and which satisfies a specified criterion of bestness for $\theta = \theta_0$. This criterion is
usually taken to be

(2) $\int_\Omega (f_0 - g(\theta_0))^2\, p_{\theta_0}\, d\mu \le \int_\Omega (f - g(\theta_0))^2\, p_{\theta_0}\, d\mu, \quad f \in \mathfrak{M},$

where $\mathfrak{M}$ denotes the class of all unbiased estimates of g; i.e., the class of all f
satisfying (1). The obvious advantage in the definition (2) is the algebraic
pliability. The obvious disadvantage is that $\mathfrak{M}$ may contain no estimate with
finite variance (cf. section 9).

¹ This article was prepared while the author was under contract with the Office of Naval
Research.

For the investigation of the fundamental questions, posed above, relating to
unbiased estimates, we shall not restrict ourselves to (2). We consider chosen
and fixed, a number s > 1, and lay down the

Definition. $f_0 \in \mathfrak{M}$ is best at $\theta_0$ if

$\infty > \int_\Omega |f_0 - g(\theta_0)|^s\, p_{\theta_0}\, d\mu \le \int_\Omega |f - g(\theta_0)|^s\, p_{\theta_0}\, d\mu, \quad f \in \mathfrak{M}.$

With this, and under the condition of a rather natural postulate on $\mathfrak{P}$ (cf. section
2), we exhibit a necessary and sufficient condition for the existence of an unbiased
estimate of g having a finite sth absolute central moment at $\theta_0$.²

Except for the discussion, in section 3, of the case in which g is constant on $\Theta$,
we do not consider directly the estimation of g, but rather that of $h = g - g(\theta_0)$.
Lemma 1, of section 2, gives the solution of the problem for g when that for h is
known. After section 3, it is assumed exclusively that h is not $\equiv 0$, except where
the contrary is explicitly stated.

In case s is finite, the existence theorem section 4, Theorem 2, asserts also the
uniqueness of the best unbiased estimate of h. It is interesting to observe the
similarity between the proof of this uniqueness and Fisher’s proof of the (what
might be called) asymptotic uniqueness of an efficient estimator [2, pp. 704, 705].
The case $s = \infty$³ is discussed in section 5; in this case we find that, in general,
the best estimate is not unique. However, for s both finite and infinite, and as
well when g is constant ($\therefore h \equiv 0$), we give a necessary and sufficient condition
that there be a unique unbiased estimate with finite s.a.c.m.⁴ (cf. section 4,
Corollary 2-1, and section 5, Theorem 3 (iii)).

Theorem 2 determines the s.a.c.m. of the best estimate as the l.u.b. of a set of
numbers given explicitly; and thereby, in particular, throws open the class of
all lower bounds of the minimum s.a.c.m. Investigations after such lower bounds,
in the classical case s = 2, have led to the well-known results of Cramér-Rao
[3, p. 480, (32.3.3)], and Bhattacharyya [4, p. 3, (1.10)]. In section 6, which is
devoted to obtaining various special lower bounds, we show how those particular
bounds fall out. It should be remarked, however, that our conditions on $\mathfrak{P}$ are
in general different from those of the above authors.
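
For orientation in the classical case s = 2, the following sketch checks a standard textbook instance of the Cramér-Rao bound; it is an illustration chosen here, not an example from the paper. For n independent Bernoulli observations with parameter theta, the variance of the sample mean, computed exactly over the binomial distribution, coincides with the bound 1 / (n I(theta)), where I(theta) = 1 / (theta (1 - theta)). The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def cramer_rao_bound(theta, n):
    """Cramér-Rao lower bound for unbiased estimates of theta from n Bernoulli trials."""
    fisher_information = 1 / (theta * (1 - theta))  # information in one observation
    return 1 / (n * fisher_information)

def variance_of_sample_mean(theta, n):
    """Exact variance of X/n for X ~ Binomial(n, theta), by direct summation."""
    return sum(
        comb(n, k) * theta**k * (1 - theta)**(n - k) * (Fraction(k, n) - theta)**2
        for k in range(n + 1)
    )

theta, n = Fraction(3, 10), 10
print(cramer_rao_bound(theta, n))         # 21/1000
print(variance_of_sample_mean(theta, n))  # 21/1000
```

Here the sample mean attains the bound at every theta, so it is best for s = 2 at each parameter point; in general the bound need not be attained.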

² For the case s = 2 an alternative existence condition, antedating these results, but not
yet published, has been obtained by C. Stein.

³ If we use, in the above definition, the sth root of the sth absolute central moment,
instead of the latter itself, then the bestness criterion for $s = \infty$ is the limiting criterion
for $s \to \infty$; viz.,

$\infty > \operatorname{ess\,sup}_{x \in \Omega} |f_0 - g(\theta_0)| \le \operatorname{ess\,sup}_{x \in \Omega} |f - g(\theta_0)|, \quad f \in \mathfrak{M},$

where ess. sup. refers to the measure $\nu(A) = \int_A p_{\theta_0}\, d\mu$.

⁴ The abbreviation s.a.c.m. will henceforth be used to indicate sth absolute central
moment at $\theta_0$.





In section 7 we give, in Theorem 7 and its corollary, a construction of the
best estimate, depending only on the knowledge of the minimum s.a.c.m. The
latter, as indicated in the preceding paragraph, is always known independently
of any knowledge of the best estimate. We use these results to obtain explicitly
(Theorems 8 and 9) the best estimates, for arbitrary s, in two cases where
we assume the minimum s.a.c.m. known. These cases, when s = 2, give the
minimum variance as determined by the equality sign in the Cramér-Rao and
Bhattacharyya inequalities, respectively.

Section 8 is given to a brief discussion of the special case s = 2. Finally, in
section 9, we present a detailed study of an example.

At the suggestion of the referee we have added an appendix in which is given a
brief running description of the fundamental ideas of Banach spaces that come
into use here. The italicized phrases are those mentioned explicitly in the course
of the paper.

We shall merely mention here certain points which will be elaborated further
in future communications. (1) The general theory developed here pertains as
well to sequential as to nonsequential estimation; one has only to make the
proper identification of $\Omega$, $\mathcal{F}$, $\mu$, and $\mathfrak{P}$. Moreover, as applied to sequential
estimation, the theory will determine the optimum stopping regions. (2) The
discussion of section 5 below can be carried through with “ess. sup.” referring
to the measure $\mu$ and $\mathcal{L}_1$ being the space of functions on $\Omega$ which are integrable
($\mu$); and for this, no restrictions whatsoever on the densities $p_\theta$ are required
(cf. the postulate of section 2), since the $p_\theta$ are elements of this $\mathcal{L}_1$ solely by
virtue of their properties as probability densities. This development would, for
example, be sufficient to yield the estimate of Girshick, Mosteller, and Savage [5]
in the case of sequential binomial estimation. Also, this unrestricted analysis is
fundamental for the problem of similar regions (a case of the bounded unbiased
estimation of a constant function). (3) For any s > 1 it may be observed in the
result of Theorem 7 below, that the best (at $\theta_0$) estimate depends only on a
sufficient statistic; this is clear from Neyman’s theorem on sufficient statistics
[6], since the best estimate depends only on ratios of the density functions $p_\theta$.
But more than this, Blackwell’s method [7] of deriving a uniformly (over the
parameter set) better unbiased estimate from a given unbiased estimate can be
proved to remain valid also when the measure of dispersion is the sth absolute
central moment, s > 1. And for this, the postulate of section 2 is not required.
(4) Finally, we point out that, with the proper specializations of $\Theta$, Cramér’s
theorem on the ellipsoid of concentration [8], Bhattacharyya’s multidimensional
inequality [9], and the extensions of the Rao, Cramér, and Bhattacharyya
bounds to sequential estimation, as for example by Blackwell and Girshick
[1], Wolfowitz [10], and Seth [20], can be drawn from Theorem 4 below.

The inspiration for the mode of analysis in the following pages, and the
major part of its substance, come from F. Riesz; his book [11, Ch. III] and the
article [12] (in particular sections 8-11 thereof). In strictly mathematical
terminology, Theorems 2 and 3 are given in [11] for the sequence-spaces $l_r$, and
Theorem 2 in [12] for the spaces $\mathcal{L}_r$ of functions on the real interval [0, 1] with
Lebesgue integrable rth powers. The proofs are given there for the case of a
denumerable set $\Theta$; in [12] an indication is given of the extension to a
non-denumerable $\Theta$. Our proof of Theorems 2 and 3, however, follows that given by
Banach [13, p. 74] for the case of denumerable $\Theta$. It is based on two results, a
theorem of Hahn-Banach [13, p. 55, Theorem 4], and the representation theorem
(suitable for the general type of $\mathcal{L}_r$ that we consider) for bounded linear
functionals on $\mathcal{L}_r$ [14, p. 338, Theorem 46]. The first of these, and the representation
theorem for any r > 1, spring in fact from the same article [12, p. 475] of Riesz.
In the case r = 1, the representation theorem is due originally to Steinhaus [15];
in the case r = 2, it was developed simultaneously in 1907 by Riesz [16] and
Fréchet [17].

Riesz' proofs of the sufficiency of the condition in Theorem 2 proceed by
constructing an explicit sequence of functions on $\Omega$ which converge strongly in
$L_s$ to the (in the present statistical terminology) best estimate. Precisely, if in
Theorem 7 below, we take, for each $n = 1, 2, \cdots$, the numbers $a_1^n, a_2^n, \cdots, a_{k_n}^n$
so that the expression

$$\frac{\left| \sum_{i=1}^{k_n} a_i^n h(\theta_i^n) \right|}{\left\| \sum_{i=1}^{k_n} a_i^n \pi_{\theta_i^n} \right\|_r}$$

is maximum, then the assertion of this theorem is that of Riesz. However,
Theorem 7 is established here without this strict requirement on the $a_i^n$. The
dropping of this restriction was essential for the proofs of Theorems 8 and 9.
The latter two theorems are, in fact, proved with the use of Corollary 7-1,
which is an even stronger result than Theorem 7. This corollary falls out of the
proof of Theorem 7 immediately, in consequence of our use of Lemma 2 for that
proof. The lemma, moreover, eases the proof of Theorem 7 markedly, in doing
away with the need for any differentiation.

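2. Preliminary considerations. We begin then by introducing the absolutely
continuous (with respect to $\mu$) measure, defined on $\mathfrak{F}$,

$$\nu(A) = \int_A p_{\theta_0} \, d\mu, \qquad A \in \mathfrak{F}.$$

A function $\phi$ is summable ($\nu$) over $\Omega$ if and only if $\phi \cdot p_{\theta_0}$ is summable ($\mu$) over
$\Omega$; and we have

$$\int_\Omega \phi \, d\nu = \int_\Omega \phi \, p_{\theta_0} \, d\mu$$

(cf. [18, pp. 36-38]). Assuming that each of the ratios

$$\pi_\theta(x) = \frac{p_\theta(x)}{p_{\theta_0}(x)}, \qquad \theta \in \Theta,$$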



UNBIASED ESTIMATES 


481 


is defined almost everywhere ($\mu$) throughout $\Omega$, it follows that $f$ is an unbiased
estimate of $g$ if and only if

$$\int_\Omega f \pi_\theta \, d\nu = g(\theta), \qquad \theta \in \Theta. \tag{3}$$

We define

$$h(\theta) = g(\theta) - g(\theta_0).$$

Since

$$\int_\Omega \pi_\theta \, d\nu = 1, \qquad \theta \in \Theta,$$

it is clear from (3) that $f$ is an unbiased estimate of $g$ if and only if $f - g(\theta_0)$ is
an unbiased estimate of $h$. Moreover, $f$ is best, for $g$, at $\theta_0$ if and only if $f - g(\theta_0)$
is best, for $h$, at $\theta_0$.

Define

$$r = \frac{s}{s - 1},$$

and let $L_r$ and $L_s$ be the spaces, normed in the usual way, of real-valued functions
on $\Omega$, with summable ($\nu$) absolute $r$th and $s$th powers, respectively. We denote
the respective norms by $\| \cdot \|_r$ and $\| \cdot \|_s$; that is, if $\xi \in L_r$ and $\eta \in L_s$,

$$\|\xi\|_r = \left( \int_\Omega |\xi|^r \, d\nu \right)^{1/r}$$

and

$$\|\eta\|_s = \left( \int_\Omega |\eta|^s \, d\nu \right)^{1/s}.$$

We note that these spaces, for $s < \infty$, are weakly compact (cf. [21]). This
property will be used in the proof of Theorem 7. Also, we shall make explicit use
of the representation theorem for linear functionals on $L_r$ [14, p. 338, Theorem 46].
The assumptions on $\mathfrak{P}$, or on $\mathfrak{P}_0 = \{\pi_\theta, \theta \in \Theta\}$, will now be the following.

Postulate: The functions $\pi_\theta$ are defined almost everywhere ($\mu$) in $\Omega$, and are
elements of $L_r$.

The foregoing considerations combine to give the following equivalence.

Lemma 1. $\phi_0 + g(\theta_0)$ is an unbiased estimate of $g$, which is best at $\theta_0$, if and
only if (i) $\phi_0$ satisfies the equations

$$\int_\Omega \phi_0 \pi_\theta \, d\nu = h(\theta), \qquad \theta \in \Theta, \tag{4}$$

and (ii) when $\phi$ is any other function satisfying (4), we have

$$\|\phi_0\|_s \le \|\phi\|_s;$$

that is, if and only if $\phi_0$ is an unbiased estimate of $h$ with minimum (finite) norm in
$L_s$. The s.a.c.m. of $\phi_0 + g(\theta_0)$ is precisely $\|\phi_0\|_s^s$.

Starting with section 4, we shall deal directly with the estimation of $h$.

3. The case of constant g. Throughout the remainder (section 4 et seq.) of
this article, the function $h$ is assumed, unless the contrary is explicitly stated,
to be non-constant; that is, since $h(\theta_0) = 0$, not $\equiv 0$. We can, and shall in this
section, obtain the results of the desired kind for the case of a constant function $g$,
by a brief, direct attack.

Let $g(\theta) \equiv g_0$, a constant. Then of course $h(\theta) \equiv 0$. One unbiased estimate of $g$
is immediately obvious, viz., $f_1(x) \equiv g_0$. The s.a.c.m. of $f_1$ is 0.

There will exist other† unbiased estimates of $g$ with finite s.a.c.m. if and
only if there exist non-null unbiased estimates, in $L_s$, of $0 \equiv h$. That is, by virtue
of the isomorphism between $L_s$ and the space of linear functionals on $L_r$, there
will exist an unbiased estimate of $g$ with finite s.a.c.m., distinct from $f_1$, if and
only if there exists a non-null functional on $L_r$ which vanishes on the elements of
$\mathfrak{P}_0 = \{\pi_\theta, \theta \in \Theta\}$. And a necessary and sufficient condition that such a functional
exist is that $\mathfrak{P}_0$ be not a fundamental set in $L_r$ [13, p. 58, Theorem 7].

Observe finally that, in any case, $f_1$ is the unique unbiased estimate of $g$ with
vanishing s.a.c.m.

We collect these results in the following statement.

Theorem 1. If $g(\theta) \equiv g_0$, a constant, then there is a unique best unbiased estimate
of $g$; viz., $f_1(x) \equiv g_0$. And the s.a.c.m. of $f_1$ is 0.

A necessary and sufficient condition that there exist no other unbiased estimates
of $g$ having finite s.a.c.m. is that the set $\mathfrak{P}_0$ be fundamental in $L_r$.

As an illustration of the ideas of this section, consider the following example:
$\Omega$ is the real interval $[0, 1]$; $\mu$ is Lebesgue measure; $\Theta$ is the set of non-negative
integers; and

$$p_\theta(x) = (\theta + 1) x^\theta.$$

And take $\theta_0 = 0$. Then, $\nu$ is again Lebesgue measure, and $\pi_\theta = p_\theta$ for each $\theta$.
For definiteness, take $r = 2$ (the results in this case are the same for any $r \ge 1$).
It is well-known that the non-negative integer powers of $x$ form a fundamental
set in $L_2$ on a finite real interval. That is, if $\xi$ is a function on $[0, 1]$, such that
$\int_0^1 \xi^2 \, dx < \infty$, and if $\epsilon > 0$, then there exist an integer $n$ and coefficients
$b_0, b_1, \cdots, b_n$ such that

$$\int_0^1 \left( \xi - \sum_{i=0}^{n} b_i x^i \right)^2 dx < \epsilon.$$

Hence, in this case an unbiased estimate with finite variance at $\theta = 0$ is unique
(as well for a non-constant function $g$ as for one which is constant over $\Theta$; cf.
section 4, Corollary 2-1).

† That is, distinct from $f_1$ in the sense of $L_s$; or, equivalently, differing from $f_1$ on a set
of positive ($\nu$) measure. Whenever, in the sequel, an equation $\xi = \eta$ appears, for two
functions $\xi$ and $\eta$ in $L_r$ or $L_s$, equality almost everywhere ($\nu$) in $\Omega$ will be understood.
It is a consequence of our postulate that if two functions on $\Omega$ are equal almost everywhere
($\nu$), they are equal almost everywhere ($\nu'$), where $\nu'$ is any one of the measures
$\nu'(A) = \int_A p_{\theta'} \, d\mu$, $\theta' \in \Theta$.
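
The approximation property quoted in this example can be observed numerically. The following Python sketch is only an illustration and is not part of the original argument: the target function $|x - 1/2|$, the grid size, and the helper name `l2_error` are assumptions introduced here, and a least-squares fit on a fine uniform grid stands in for the exact $L_2$ projection.

```python
import numpy as np
from numpy.polynomial import Legendre


def l2_error(f, degree, n_grid=20001):
    """Approximate squared L2[0, 1] distance from f to polynomials of a given degree.

    A least-squares fit on a uniform grid stands in for the exact L2
    projection (an assumption of this sketch, not of the paper).
    """
    x = np.linspace(0.0, 1.0, n_grid)
    y = f(x)
    # The Legendre basis is used only for numerical conditioning; the fitted
    # function is still a linear combination of 1, x, ..., x**degree.
    fit = Legendre.fit(x, y, degree, domain=[0.0, 1.0])
    return float(np.mean((y - fit(x)) ** 2))


def target(x):
    # Hypothetical square-integrable function with a kink at x = 1/2.
    return np.abs(x - 0.5)


errors = [l2_error(target, d) for d in (1, 4, 8, 16)]
print(errors)  # the errors shrink as the degree grows
```

Under these assumptions the printed errors decrease with the degree, in line with the statement that the powers of $x$ are fundamental in $L_2$ on $[0, 1]$.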


4. The main theorem for non-constant h. We shall denote by $\mathfrak{M}_s$ the class
(or the set in $L_s$) of all unbiased estimates of $h$ that belong to $L_s$.

Theorem 2. (i) A necessary and sufficient condition that $\mathfrak{M}_s$ be non-empty is
that there exist a constant $C$ such that for every set of $n$ functions $\pi_{\theta_1}, \pi_{\theta_2}, \cdots, \pi_{\theta_n}$
in $\mathfrak{P}_0$, and every set of $n$ real numbers $a_1, \cdots, a_n$, we have, for every $n = 1, 2, \cdots$,

$$\left| \sum_{i=1}^{n} a_i h(\theta_i) \right| \le C \left\| \sum_{i=1}^{n} a_i \pi_{\theta_i} \right\|_r. \tag{5}$$

(ii) For every $\phi \in \mathfrak{M}_s$ we have $\|\phi\|_s \ge C_0$, where $C_0$ is the g.l.b. of the set of
admissible constants $C$ in (5).

(iii) If $\mathfrak{M}_s$ is non-empty there is a unique $\phi_0 \in \mathfrak{M}_s$ with $\|\phi_0\|_s = C_0$. Thus,
$\phi_0$ is the unique unbiased estimate of $h$ which is best at $\theta_0$.

The non-constancy of $h$ clearly implies $C_0 > 0$.

The necessity of condition (5) is immediate. Suppose $\phi \in \mathfrak{M}_s$, so that $\phi$ satisfies
equations (4); then, for any $\theta_1, \theta_2, \cdots, \theta_n$, and any real numbers $a_1, a_2, \cdots, a_n$,

$$\left| \sum_{i=1}^{n} a_i h(\theta_i) \right| = \left| \int_\Omega \phi \sum_{i=1}^{n} a_i \pi_{\theta_i} \, d\nu \right|.$$

By the Hölder inequality it follows that

$$\left| \sum_{i=1}^{n} a_i h(\theta_i) \right| \le \|\phi\|_s \cdot \left\| \sum_{i=1}^{n} a_i \pi_{\theta_i} \right\|_r.$$

Hence (5) is satisfied with $C = \|\phi\|_s$.

Part (ii) of the theorem is hereby proved as well.

Suppose $\mathfrak{M}_s$ non-empty, and $\phi_0, \phi_1$ in $\mathfrak{M}_s$, such that $\|\phi_0\|_s = \|\phi_1\|_s = C_0$.
Then $\tfrac{1}{2}(\phi_0 + \phi_1) \in \mathfrak{M}_s$ and therefore

$$\tfrac{1}{2} \|\phi_0 + \phi_1\|_s \ge C_0.$$

But, by the Minkowski inequality,

$$\tfrac{1}{2} \|\phi_0 + \phi_1\|_s \le \tfrac{1}{2} \left( \|\phi_0\|_s + \|\phi_1\|_s \right) = C_0.$$

Hence

$$\|\phi_0 + \phi_1\|_s = \|\phi_0\|_s + \|\phi_1\|_s.$$

This equality implies $\phi_1 = a \phi_0$ for some positive $a$. But since the norms of $\phi_0$
and $\phi_1$ are equal (and $\ne 0$) $a$ must be unity. Thus the uniqueness of $\phi_0$ is proved.



It remains now to prove, assuming (5) satisfied, the existence of $\phi_0$. Consider
the functional $F$ on $\mathfrak{P}_0$ defined by

$$F(\pi_\theta) = h(\theta).$$

The Hahn-Banach theorem alluded to in section 1 (viz., [13, p. 55, Theorem 4])
has precisely (5) as a necessary and sufficient condition for the existence of a
linear functional $G$ on $L_r$ satisfying

(a) $G(\pi_\theta) = h(\theta)$, $\theta \in \Theta$;

(b) $\|G\| \le C$;

where $\|G\|$ is the norm of $G$, i.e.,

$$\|G\| = \underset{\xi \in L_r}{\text{l.u.b.}} \frac{|G(\xi)|}{\|\xi\|_r}.$$

In particular, taking $C = C_0$, there is a linear functional $G_0$ on $L_r$ with

(a') $G_0(\pi_\theta) = h(\theta)$, $\theta \in \Theta$;

(b') $\|G_0\| \le C_0$.

But, for an element $\sum_{i=1}^{n} a_i \pi_{\theta_i}$ in the linear manifold $[\mathfrak{P}_0]$ spanned by the $\pi_\theta$,

$$G_0\left( \sum_{i=1}^{n} a_i \pi_{\theta_i} \right) = \sum_{i=1}^{n} a_i h(\theta_i),$$

so that

$$C_0 \ge \|G_0\| \ge \underset{\xi \in [\mathfrak{P}_0]}{\text{l.u.b.}} \frac{|G_0(\xi)|}{\|\xi\|_r} = C_0.$$

Hence (b') is replaced by the precise statement

(b'') $\|G_0\| = C_0$.

Now the representation theorem for linear functionals on $L_r$ asserts the existence
of $\phi_0 \in L_s$, such that

$$G_0(\xi) = \int_\Omega \xi \phi_0 \, d\nu, \qquad \xi \in L_r,$$

and

$$\|\phi_0\|_s = \|G_0\| = C_0.$$

This taken with (a') establishes the existence of $\phi_0 \in L_s$ satisfying

$$\int_\Omega \phi_0 \pi_\theta \, d\nu = h(\theta), \qquad \theta \in \Theta,$$

$$\|\phi_0\|_s = C_0,$$

and this completes the proof of the theorem.

It is readily seen that $\mathfrak{M}_s$ will consist of more than just $\phi_0$ if and only if there
exists a non-null functional on $L_r$ which vanishes on $\mathfrak{P}_0$. Our discussion in
section 3 therefore enables us to assert the following.

Corollary 2-1. $\mathfrak{M}_s$, when it is non-empty, consists of $\phi_0$ alone if and only if
$\mathfrak{P}_0$ is fundamental in $L_r$.

A word is in order concerning the following two consequences of the boundedness
of the measure $\nu$: (i) if $\mathfrak{P}_0 \subset L_r$, then also $\mathfrak{P}_0 \subset L_{r'}$ for every $r' < r$; (ii) if
$\phi \in L_s$, then also $\phi \in L_{s'}$ for every $s' < s$. Otherwise stated: (i') if $\mathfrak{P}_0$ satisfies the
postulate of section 2 for the number $r$, it likewise satisfies this postulate for
every (admissible) $r' < r$; (ii') if $\mathfrak{M}_s$ is non-empty, then $\mathfrak{M}_{s'}$ is non-empty for
every $s' < s$. Regarding (i') we shall make only the obvious remark that although
$\mathfrak{P}_0$ satisfies the postulate for every $r' < r$, there may be values of $r' < r$ such
that no $C$ for (5) exists; this will be exemplified in section 9. Where (ii') is
concerned, it is clear that the non-emptiness of $\mathfrak{M}_s$ will not necessarily imply that
$\mathfrak{P}_0 \subset L_{s'/(s'-1)}$ for every $s' < s$, even though for every such $s'$ $\mathfrak{M}_{s'}$ is non-empty.
If for every $\theta \in \Theta$ other than $\theta_0$ we have $\pi_\theta \notin L_{s'/(s'-1)}$, for some particular $s' < s$,
then we may have the situation in which there are elements in $\mathfrak{M}_{s'}$ with norms
arbitrarily close to 0. However, this cannot be the case if (a) for some $\theta$ other
than $\theta_0$, $\pi_\theta \in L_{s'/(s'-1)}$, and (b) $h$ does not vanish identically on $\Theta'$, the set of those
$\theta$ for which $\pi_\theta \in L_{s'/(s'-1)}$. For, when these two conditions are satisfied, Theorem
2 applies to $h$ as defined on $\Theta'$; consequently there is a positive lower bound
for the $s'$-norms of the unbiased estimates of $h$ over $\Theta'$. And since every element
of $\mathfrak{M}_{s'}$ is, in particular, an unbiased estimate of $h$ over $\Theta'$, it follows that
the norms of those elements are bounded below by a positive number.

5. The case $s = \infty$ ($r = 1$). Let $\mathfrak{M}_\infty$ denote the class of essentially bounded
($\nu$) unbiased estimates of $h$; and let bestness at $\theta_0$ be defined with respect to the
essential absolute suprema of the elements of this class. That is, the unbiased
estimate $\phi_0$, of $h$, is best at $\theta_0$ if

$$\underset{x \in \Omega}{\text{ess. sup.}} \, |\phi_0(x)| < \infty,$$

and if, when $\phi$ is another unbiased estimate of $h$, we have

$$\underset{x \in \Omega}{\text{ess. sup.}} \, |\phi_0(x)| \le \underset{x \in \Omega}{\text{ess. sup.}} \, |\phi(x)|.$$

The fundamental postulate for the functions $\pi_\theta$ is, in this case, that $\mathfrak{P}_0 \subset L_1$.

Now, $L_\infty$, the space of essentially bounded, measurable ($\nu$) functions on $\Omega$,
normed by ess. sup., is the space of linear functionals on $L_1$ [14, p. 338].
Examination of the proof of Theorem 2 will show that that proof goes through also in the
present case in all but one detail: we cannot here in general prove the uniqueness
of the best estimate. The proof of uniqueness breaks down since the equality

$$\text{ess. sup.} \, |\phi_0(x) + \phi_1(x)| = \text{ess. sup.} \, |\phi_0(x)| + \text{ess. sup.} \, |\phi_1(x)|$$

does not imply that $\phi_1$ is a constant multiple of $\phi_0$. Of course, if $\mathfrak{P}_0$ is fundamental
in $L_1$, we have a fortiori the uniqueness of the best estimate.

The results for the case $s = \infty$ are then the following.

Theorem 3. (i) A necessary and sufficient condition that $\mathfrak{M}_\infty$ be non-empty is
that there exist a constant $C$ such that for every set of $n$ functions $\pi_{\theta_1}, \pi_{\theta_2}, \cdots, \pi_{\theta_n}$
in $\mathfrak{P}_0$, and every set of $n$ real numbers $a_1, a_2, \cdots, a_n$, we have, for every $n = 1, 2, \cdots$,

$$\left| \sum_{i=1}^{n} a_i h(\theta_i) \right| \le C \left\| \sum_{i=1}^{n} a_i \pi_{\theta_i} \right\|_1.$$

(ii) For every $\phi \in \mathfrak{M}_\infty$ we have $\|\phi\|_\infty \ge C_0$, where $C_0$ is the g.l.b. of the set of
admissible constants $C$ above.

(iii) When $\mathfrak{M}_\infty$ is non-empty, it contains elements with norm equal to $C_0$. These
are the best (at $\theta_0$) unbiased estimates of $h$. When $\mathfrak{P}_0$ is not fundamental in $L_1$,
there need not exist a unique best estimate.

We close this section with the remark that Theorem 1 remains valid, as it
stands, in the case $s = \infty$.

6. Particular lower bounds for the minimum s.a.c.m. In order to stress their
significance in the statistical context, we shall give the statements of this section
with the help of the symbol $\sigma_s(\phi)$ for the $s$th root of the s.a.c.m. of the unbiased
estimate $\phi$, of $h$. We have, of course, the relation

$$\sigma_s(\phi) = \|\phi\|_s.$$

Now, one of the most important aspects of Theorem 2 is that it presents us
immediately with an explicit evaluation of the minimum $\sigma_s(\phi)$ for all $\phi \in \mathfrak{M}_s$.
We state the formula in the form of a theorem.

Theorem 4. Let $\mathfrak{R}$ denote the set of all real numbers. Then,

$$\underset{\phi \in \mathfrak{M}_s}{\text{g.l.b.}} \, \sigma_s(\phi) = \underset{a_i \in \mathfrak{R},\ \theta_i \in \Theta,\ n}{\text{l.u.b.}} \frac{\left| \sum_{i=1}^{n} a_i h(\theta_i) \right|}{\left\| \sum_{i=1}^{n} a_i \pi_{\theta_i} \right\|_r}.$$

For brevity, let us set

$$\underset{\phi \in \mathfrak{M}_s}{\text{g.l.b.}} \, \sigma_s(\phi) = \sigma_s^{\min}.$$

Since this theorem expresses $\sigma_s^{\min}$ as the l.u.b. of an explicit set of numbers,
it is clear that the class of all lower bounds of $\sigma_s^{\min}$ is thereby thrown open to us.
It follows that, when $s = r = 2$ and our hypotheses on $\mathfrak{P}$ are fulfilled, the classical
lower bounds of Cramér-Rao [3, p. 480] and Bhattacharyya [4, p. 3] are
particularized consequences of Theorem 4. In the results that follow here we shall
indicate the deduction of those classical bounds. We need not, however, restrict $s$.

For a moment, let us denote by $\pi(x)$ the function on $\Theta$ which assigns the
value $\pi_\rho(x)$ to the point $\rho \in \Theta$, and let $\Theta$ be an interval on the real axis. Then we
shall, below, write $\pi'_\theta$ for the function (when it exists) on $\Omega$ which assigns the
value $(d\pi(x)/d\rho)_{\rho=\theta}$ to $x \in \Omega$. Similarly, $\pi''_\theta$ for the function assigning the value
$(d^2\pi(x)/d\rho^2)_{\rho=\theta}$ to $x$; and so on.

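Theorem 5. Suppose the following conditions fulfilled:

(i) $\Theta = \mathfrak{J}$, an interval on the real axis;

(ii) $h$ is differentiable on $\Theta' \subseteq \mathfrak{J}$;

(iii) for each $\theta \in \Theta'$, $\pi'_\theta$ is defined almost everywhere ($\nu$), and is an element of $L_r$;

(iv) for each $\theta \in \Theta'$,

$$\lim_{\rho \to \theta} \left\| \frac{\pi_\rho - \pi_\theta}{\rho - \theta} - \pi'_\theta \right\|_r = 0.$$

Then, for any $m + n$ ($m, n = 1, 2, \cdots$) points $\theta_1, \theta_2, \cdots, \theta_m$ in $\mathfrak{J}$, and $\theta'_1, \theta'_2,
\cdots, \theta'_n$ in $\Theta'$, and any $m + n$ real numbers $a_1, a_2, \cdots, a_m, b_1, b_2, \cdots, b_n$ such that

$$\left\| \sum_{i=1}^{m} a_i \pi_{\theta_i} + \sum_{i=1}^{n} b_i \pi'_{\theta'_i} \right\|_r \ne 0,$$

we have

$$\sigma_s^{\min} \ge \frac{\left| \sum_{i=1}^{m} a_i h(\theta_i) + \sum_{i=1}^{n} b_i h'(\theta'_i) \right|}{\left\| \sum_{i=1}^{m} a_i \pi_{\theta_i} + \sum_{i=1}^{n} b_i \pi'_{\theta'_i} \right\|_r}. \tag{6}$$

The prime on the $h$ in (6) denotes the derivative of $h$.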

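To prove this theorem, observe first that by virtue of Theorem 4, we may write

$$\sigma_s^{\min} \ge \frac{\left| \sum_{i=1}^{m} a_i h(\theta_i) + \sum_{i=1}^{n} b_i \dfrac{h(\rho_i) - h(\theta'_i)}{\rho_i - \theta'_i} \right|}{\left\| \sum_{i=1}^{m} a_i \pi_{\theta_i} + \sum_{i=1}^{n} b_i \dfrac{\pi_{\rho_i} - \pi_{\theta'_i}}{\rho_i - \theta'_i} \right\|_r}$$

for every set of points $\rho_1, \rho_2, \cdots, \rho_n$ in $\mathfrak{J}$ such that the denominator of the
right-hand side is defined and $\ne 0$. Therefore, also

$$\sigma_s^{\min} \ge \lim_{\rho_i \to \theta'_i} \frac{\left| \sum_{i=1}^{m} a_i h(\theta_i) + \sum_{i=1}^{n} b_i \dfrac{h(\rho_i) - h(\theta'_i)}{\rho_i - \theta'_i} \right|}{\left\| \sum_{i=1}^{m} a_i \pi_{\theta_i} + \sum_{i=1}^{n} b_i \dfrac{\pi_{\rho_i} - \pi_{\theta'_i}}{\rho_i - \theta'_i} \right\|_r}. \tag{7}$$

Now, by condition (iv), the element

$$\sum_{i=1}^{m} a_i \pi_{\theta_i} + \sum_{i=1}^{n} b_i \frac{\pi_{\rho_i} - \pi_{\theta'_i}}{\rho_i - \theta'_i}$$

of $L_r$ converges, in the strong sense in $L_r$, to

$$\sum_{i=1}^{m} a_i \pi_{\theta_i} + \sum_{i=1}^{n} b_i \pi'_{\theta'_i},$$

as $\rho_i \to \theta'_i$, $i = 1, 2, \cdots, n$. Consequently we have convergence of the norm;
that is, the denominator of the right-hand side of (7) converges to the
denominator of (6). (The latter is $\ne 0$, so that for all $\rho_i$ sufficiently close to $\theta'_i$, $i =
1, 2, \cdots, n$, the ratios in (7) are defined.) There is no difficulty about the
convergence of the numerator of (7) to that of (6). The theorem is thus proved.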

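Corollary 5-1. Under the hypothesis of Theorem 5, we have, in particular,
when $\theta_0 \in \Theta'$ and $\|\pi'_{\theta_0}\|_r \ne 0$,

$$\sigma_s^{\min} \ge \frac{|h'(\theta_0)|}{\|\pi'_{\theta_0}\|_r}. \tag{8}$$

If we denote by $p$ the function on $\Omega \times \Theta$ which assigns the value $p_\theta(x)$ to the
point $(x, \theta)$, and write (8) in the form

$$\left( \sigma_s^{\min} \right)^r \ge \frac{|h'(\theta_0)|^r}{\displaystyle\int_\Omega \left| \frac{\partial \log p}{\partial \theta} \right|^r_{\theta = \theta_0} p_{\theta_0} \, d\mu}, \tag{8'}$$

the generalization of the Cramér-Rao inequality afforded by (8) becomes
evident.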
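
The relation between the finite-difference bounds of Theorem 4 and the derivative bound (8) can be checked numerically in the case $s = r = 2$. The sketch below is illustrative only and rests on assumptions not made in the text: the family is taken to be normal with mean $\theta$ and unit variance, $\theta_0 = 0$, $g(\theta) = \theta$ (so that $h(\theta) = \theta$), and the function name `two_point_bound` and the integration grid are hypothetical. It evaluates the Theorem 4 ratio for the two points $\theta, \theta_0$ with coefficients $1, -1$, namely $|h(\theta)| / \|\pi_\theta - 1\|_2$, whose closed form in this family is $|\theta| / \sqrt{e^{\theta^2} - 1}$.

```python
import numpy as np


def two_point_bound(theta):
    """Ratio |h(theta)| / ||pi_theta - pi_0||_2 for N(theta, 1), theta0 = 0, h(theta) = theta.

    The norm is taken with respect to nu (the N(0, 1) law) and is computed
    by a Riemann sum on a wide grid; family and grid are assumptions of
    this sketch.
    """
    x = np.linspace(-12.0, 12.0, 200001)
    dx = x[1] - x[0]
    p0 = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)   # density of nu
    pi_theta = np.exp(theta * x - 0.5 * theta**2)     # ratio p_theta / p_0
    norm_sq = np.sum((pi_theta - 1.0) ** 2 * p0) * dx
    return abs(theta) / np.sqrt(norm_sq)


for theta in (1.0, 0.5, 0.1, 0.01):
    closed_form = abs(theta) / np.sqrt(np.expm1(theta**2))
    print(theta, two_point_bound(theta), closed_form)
```

As $\theta \to 0$ the ratio increases toward 1, which is the value $|h'(0)| / \|\pi'_0\|_2$ given by (8) for this family, since there $\pi'_0(x) = x$ and $\|\pi'_0\|_2 = 1$.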

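Using the result and method of Theorem 5, we can establish the next in a
hierarchy of theorems.

Theorem 6. Suppose the hypothesis of Theorem 5 satisfied, and the following
condition fulfilled: for each $\theta$ in a non-empty subset $\Theta''$ of $\Theta'$, (i) $h''(\theta)$ (the second
derivative) exists and (ii) $\pi''_\theta$ is defined almost everywhere ($\nu$), is an element of $L_r$,
and satisfies

$$\lim_{\rho \to \theta} \left\| \frac{\pi'_\rho - \pi'_\theta}{\rho - \theta} - \pi''_\theta \right\|_r = 0.$$

Then, for any $m + n + q$ ($m, n, q = 1, 2, \cdots$) points $\theta_1, \theta_2, \cdots, \theta_m$ in $\mathfrak{J}$,
$\theta'_1, \theta'_2, \cdots, \theta'_n$ in $\Theta'$, and $\theta''_1, \theta''_2, \cdots, \theta''_q$ in $\Theta''$, and any $m + n + q$ real numbers
$a_1, a_2, \cdots, a_m, b_1, b_2, \cdots, b_n, c_1, c_2, \cdots, c_q$ such that

$$\left\| \sum_{i=1}^{m} a_i \pi_{\theta_i} + \sum_{i=1}^{n} b_i \pi'_{\theta'_i} + \sum_{i=1}^{q} c_i \pi''_{\theta''_i} \right\|_r \ne 0,$$

we have

$$\sigma_s^{\min} \ge \frac{\left| \sum_{i=1}^{m} a_i h(\theta_i) + \sum_{i=1}^{n} b_i h'(\theta'_i) + \sum_{i=1}^{q} c_i h''(\theta''_i) \right|}{\left\| \sum_{i=1}^{m} a_i \pi_{\theta_i} + \sum_{i=1}^{n} b_i \pi'_{\theta'_i} + \sum_{i=1}^{q} c_i \pi''_{\theta''_i} \right\|_r}.$$

Just as in the case of the previous theorem, we have here an immediate corollary.

Corollary 6-1. Under the hypothesis of Theorem 6, we have in particular, when
$\theta_0 \in \Theta' \cap \Theta''$,

$$\sigma_s^{\min} \ge \frac{|b h'(\theta_0) + c h''(\theta_0)|}{\|b \pi'_{\theta_0} + c \pi''_{\theta_0}\|_r} \tag{9}$$

for any two real numbers, $b$ and $c$, such that the denominator of the right-hand side
does not vanish.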

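Consider (9) in the particular case $s = r = 2$. In this case, (9) may be written,
explicitly,

$$\left( \sigma_2^{\min} \right)^2 \ge \frac{\left[ b h'(\theta_0) + c h''(\theta_0) \right]^2}{b^2 \displaystyle\int_\Omega \frac{1}{p_{\theta_0}} \left( \frac{\partial p}{\partial \theta} \right)^2 d\mu + 2bc \int_\Omega \frac{1}{p_{\theta_0}} \frac{\partial p}{\partial \theta} \frac{\partial^2 p}{\partial \theta^2} \, d\mu + c^2 \int_\Omega \frac{1}{p_{\theta_0}} \left( \frac{\partial^2 p}{\partial \theta^2} \right)^2 d\mu}, \tag{10}$$

the derivatives of $p$ being evaluated at $\theta = \theta_0$.
In particular, (10) holds for values of $b$ and $c$ which maximize the right-hand
side. And that maximum value is found, in the usual way, to be

$$J^{11} [h'(\theta_0)]^2 + 2 J^{12} h'(\theta_0) h''(\theta_0) + J^{22} [h''(\theta_0)]^2,$$

where the matrix

$$\begin{pmatrix} J^{11} & J^{12} \\ J^{21} & J^{22} \end{pmatrix}$$

is the inverse of the matrix

$$\begin{pmatrix} \displaystyle\int_\Omega \frac{1}{p_{\theta_0}} \left( \frac{\partial p}{\partial \theta} \right)^2 d\mu & \displaystyle\int_\Omega \frac{1}{p_{\theta_0}} \frac{\partial p}{\partial \theta} \frac{\partial^2 p}{\partial \theta^2} \, d\mu \\[2ex] \displaystyle\int_\Omega \frac{1}{p_{\theta_0}} \frac{\partial p}{\partial \theta} \frac{\partial^2 p}{\partial \theta^2} \, d\mu & \displaystyle\int_\Omega \frac{1}{p_{\theta_0}} \left( \frac{\partial^2 p}{\partial \theta^2} \right)^2 d\mu \end{pmatrix}.$$

Thus, we have

$$\left( \sigma_2^{\min} \right)^2 \ge J^{11} [h'(\theta_0)]^2 + 2 J^{12} h'(\theta_0) h''(\theta_0) + J^{22} [h''(\theta_0)]^2. \tag{11}$$

This is seen to be Bhattacharyya's result for the case of derivatives up to second
order.

It is obvious how we extend Theorem 6 to obtain a similar result involving the
functions $\pi_\theta, \pi'_\theta, \pi''_\theta, \cdots, \pi^{(n)}_\theta$, for any assigned $n$. And it is thereafter clear
how, in the case $s = r = 2$, Bhattacharyya's general inequality may be deduced.

It is clear that we can proceed from Theorem 4, under suitable conditions,
to lower bounds for $\sigma_s^{\min}$ which involve integrals of the functions $\pi(x)$ (and the
corresponding integrals of $h$) as well as the derivatives of these functions.

In closing this section we note that all the above considerations apply equally
to the case $s = \infty$.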

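7. Determination of the best estimate. We shall now prove the following
theorem, which provides an explicit construction of the best (at $\theta_0$) estimate of $h$.
We repeat that $s$ is now taken to be finite.

Theorem 7. Let $\mathfrak{M}_s$ be non-empty, and $\phi_0$ be the best (at $\theta_0$) unbiased estimate of $h$.
Let $\{\theta_i^n, i = 1, 2, \cdots, k_n\}$, $n = 1, 2, \cdots$, be a sequence of (finite) sets of points of
$\Theta$, and $\{a_i^n, i = 1, 2, \cdots, k_n\}$, $n = 1, 2, \cdots$, a sequence of sets of real numbers,
such that

$$\lim_{n \to \infty} \frac{\left| \sum_{i=1}^{k_n} a_i^n h(\theta_i^n) \right|}{\left\| \sum_{i=1}^{k_n} a_i^n \pi_{\theta_i^n} \right\|_r} = C_0 = \|\phi_0\|_s = \sigma_s^{\min}.$$

Then the functions

$$f_n(x) = \frac{\sum_{i=1}^{k_n} a_i^n h(\theta_i^n)}{\left\| \sum_{i=1}^{k_n} a_i^n \pi_{\theta_i^n} \right\|_r^r} \cdot \left| \sum_{i=1}^{k_n} a_i^n \pi_{\theta_i^n}(x) \right|^{r/s} \operatorname{sgn} \left( \sum_{i=1}^{k_n} a_i^n \pi_{\theta_i^n}(x) \right)$$

(are elements of $L_s$ and) converge strongly in $L_s$ to $\phi_0$.
The strong convergence here means precisely that

$$\lim_{n \to \infty} \int_\Omega |f_n - \phi_0|^s \, d\nu = 0.$$

Clearly, we may, with no loss in generality, assume the numbers $a_i^n$ to be
such that

$$\left\| \sum_{i=1}^{k_n} a_i^n \pi_{\theta_i^n} \right\|_r = 1, \qquad n = 1, 2, \cdots. \tag{12}$$

We shall suppose this to be the case throughout the proof. Then the essential
property of the $\theta_i^n$ and the $a_i^n$ is that

$$\lim_{n \to \infty} \left| \sum_{i=1}^{k_n} a_i^n h(\theta_i^n) \right| = C_0. \tag{13}$$

And in this normalized situation, the functions $f_n$ will be given by

$$f_n(x) = \sum_{i=1}^{k_n} a_i^n h(\theta_i^n) \cdot \left| \sum_{i=1}^{k_n} a_i^n \pi_{\theta_i^n}(x) \right|^{r/s} \operatorname{sgn} \left( \sum_{i=1}^{k_n} a_i^n \pi_{\theta_i^n}(x) \right). \tag{14}$$

That these functions are elements of $L_s$ is easily seen; in fact,

$$\|f_n\|_s = \left| \sum_{i=1}^{k_n} a_i^n h(\theta_i^n) \right|.$$

The proof of this theorem will consist mainly in the application of the following
two lemmas.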

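Lemma 2. Let $\eta \in L_s$, and $\{\xi_n, n = 1, 2, \cdots\}$ be a sequence of functions in
$L_r$ such that

(i) $\|\xi_n\|_r = 1$, $n = 1, 2, \cdots$;

(ii) $\lim_{n \to \infty} \int_\Omega \xi_n \eta \, d\nu = \|\eta\|_s$.

Then $\xi_n$ converges strongly in $L_r$ to the function

$$\xi_0 = \frac{1}{\|\eta\|_s^{s/r}} \, |\eta|^{s/r} \operatorname{sgn} \eta.$$

Let us observe first that

$$\int_\Omega \xi_0 \eta \, d\nu = \|\eta\|_s \tag{15}$$

and

$$\|\xi_0\|_r = 1.$$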
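
The two relations just stated follow directly from $1/r + 1/s = 1$. The computation, which the text leaves to the reader, may be written out as follows; it is a supplementary verification rather than part of the original proof, and it assumes $\eta$ is not the null function.

```latex
% Uses only 1/r + 1/s = 1, i.e. s/r + 1 = s and s - s/r = 1.
\begin{aligned}
\int_\Omega \xi_0 \, \eta \, d\nu
  &= \frac{1}{\|\eta\|_s^{s/r}} \int_\Omega |\eta|^{s/r + 1} \, d\nu
   = \frac{\|\eta\|_s^{s}}{\|\eta\|_s^{s/r}}
   = \|\eta\|_s^{\, s - s/r}
   = \|\eta\|_s , \\
\|\xi_0\|_r^{r}
  &= \frac{1}{\|\eta\|_s^{s}} \int_\Omega |\eta|^{s} \, d\nu
   = 1 .
\end{aligned}
```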

Furthermore, $\xi_0$ is the unique element with norm $\le 1$ in $L_r$ having the property
(15). For, if also,

$$\int_\Omega \xi'_0 \eta \, d\nu = \|\eta\|_s, \qquad \|\xi'_0\|_r \le 1,$$

we then have

$$\tfrac{1}{2} \int_\Omega (\xi_0 + \xi'_0) \eta \, d\nu = \|\eta\|_s;$$

and from this,

$$\tfrac{1}{2} \|\xi_0 + \xi'_0\|_r \, \|\eta\|_s \ge \|\eta\|_s.$$

That is,

$$\|\xi_0 + \xi'_0\|_r \ge 2 \ge \|\xi_0\|_r + \|\xi'_0\|_r.$$

From this, and (Minkowski)

$$\|\xi_0 + \xi'_0\|_r \le \|\xi_0\|_r + \|\xi'_0\|_r,$$

we have

$$\|\xi_0 + \xi'_0\|_r = \|\xi_0\|_r + \|\xi'_0\|_r.$$

Therefore, for some $a > 0$, $\xi'_0 = a \xi_0$. But we must have $a = 1$ if $\xi_0$ and $\xi'_0$ are
both to satisfy (15), as assumed. Hence $\xi'_0 = \xi_0$.

Now consider the sequence $\{\xi_n\}$. Choose a sub-sequence $\{\xi_{n_i}\}$ that converges
weakly to, say, $\xi'$. Then $\|\xi'\|_r \le 1$. We have

$$\int_\Omega \xi' \eta \, d\nu = \lim_{i \to \infty} \int_\Omega \xi_{n_i} \eta \, d\nu = \|\eta\|_s.$$

Hence, $\xi' = \xi_0$. And since $1 = \|\xi_{n_i}\|_r \to 1 = \|\xi_0\|_r$, it follows that $\xi_{n_i}$ converges
strongly to $\xi_0$ (cf. [13, p. 139, section 3]).

Suppose there is a subsequence $\{\xi_{m_i}\}$ of $\{\xi_n\}$ such that

$$\|\xi_{m_i} - \xi_0\|_r > \delta > 0, \qquad i = 1, 2, \cdots.$$

We have, nonetheless, for this subsequence, the hypotheses of our lemma
satisfied. We can therefore apply the argument of the previous paragraph to
extract a subsequence of $\{\xi_{m_i}\}$ which converges strongly to $\xi_0$. This is in obvious
contradiction to the above $\delta$-assumption, and the lemma is hereby proved.

Lemma 3. Lemma 2 remains true with the roles of $L_r$ and $L_s$ interchanged.

This is obvious.

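Returning now to the proof of Theorem 7, let us first, for the sake of brevity,
introduce the notation:

$$c_n = \sum_{i=1}^{k_n} a_i^n h(\theta_i^n), \qquad \gamma_n = \operatorname{sgn} c_n, \qquad \psi_n = \sum_{i=1}^{k_n} a_i^n \pi_{\theta_i^n}.$$

From

$$\int_\Omega \phi_0 \pi_\theta \, d\nu = h(\theta), \qquad \theta \in \Theta,$$

we easily obtain

$$\int_\Omega \phi_0 \psi_n \, d\nu = c_n, \qquad n = 1, 2, \cdots,$$

which we may write

$$\int_\Omega \phi_0 \gamma_n \psi_n \, d\nu = |c_n|, \qquad n = 1, 2, \cdots.$$

Since $|c_n| \to \|\phi_0\|_s$ (cf. (13)) and $\|\gamma_n \psi_n\|_r = 1$, $n = 1, 2, \cdots$ (cf. (12)), we
have, by Lemma 2, that $\gamma_n \psi_n$ converges strongly to

$$\psi_0 = \frac{1}{C_0^{s/r}} \, |\phi_0|^{s/r} \operatorname{sgn} \phi_0. \tag{16}$$

The functions (cf. (14))

$$f_n = c_n \, |\psi_n|^{r/s} \operatorname{sgn} \psi_n$$

obviously satisfy

$$\int_\Omega f_n \gamma_n \psi_n \, d\nu = |c_n|, \qquad n = 1, 2, \cdots.$$

And from this we conclude that

$$\lim_{n \to \infty} \int_\Omega f_n \psi_0 \, d\nu = C_0,$$

or

$$\lim_{n \to \infty} \int_\Omega \frac{f_n}{|c_n|} \psi_0 \, d\nu = 1 = \|\psi_0\|_r.$$

We may apply Lemma 3 to this result, since $\| f_n / |c_n| \|_s = 1$, $n = 1, 2, \cdots$.
And we thereby conclude that $f_n / |c_n|$ converges strongly to

$$|\psi_0|^{r/s} \operatorname{sgn} \psi_0,$$

which, on substituting from the definition (16) of $\psi_0$, we find to be just

$$\frac{\phi_0}{C_0}.$$

Since $|c_n| \to C_0$, it follows immediately that $f_n$ converges strongly to $\phi_0$; and
the theorem is proved.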

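The following corollary is actually of greater use in applications than Theorem
7 itself, for the reason that it leaves no doubt about the form of $\lim f_n$ (i.e., $\phi_0$)
when we know explicitly the form of $\lim \gamma_n \psi_n$.

Corollary 7-1. Assume the hypothesis of Theorem 7. Then the functions

$$\frac{\operatorname{sgn} \left( \sum_{i=1}^{k_n} a_i^n h(\theta_i^n) \right)}{\left\| \sum_{i=1}^{k_n} a_i^n \pi_{\theta_i^n} \right\|_r} \sum_{i=1}^{k_n} a_i^n \pi_{\theta_i^n}$$

converge strongly, in $L_r$, to a function $\psi_0$, and

$$\phi_0 = C_0 \, |\psi_0|^{r/s} \operatorname{sgn} \psi_0.$$

This is clear from the proof of the theorem.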

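By way of illustrating the application of these results, we shall prove the
following theorem.

Theorem 8. Assume the hypothesis of Theorem 5. And, further, let the equality
sign hold in (8). Then,

$$\phi_0(x) = \frac{h'(\theta_0)}{\|\pi'_{\theta_0}\|_r^r} \cdot |\pi'_{\theta_0}(x)|^{r/s} \operatorname{sgn} \pi'_{\theta_0}(x).$$

Since (8) is an equality, we may, under the hypothesis of Theorem 5, consider
that we have

$$C_0 = \lim_{n \to \infty} \frac{\left| \dfrac{1}{\rho_n - \theta_0} h(\rho_n) - \dfrac{1}{\rho_n - \theta_0} h(\theta_0) \right|}{\left\| \dfrac{1}{\rho_n - \theta_0} \pi_{\rho_n} - \dfrac{1}{\rho_n - \theta_0} \pi_{\theta_0} \right\|_r}, \tag{17}$$

where $\{\rho_n\}$ is a sequence in $\mathfrak{J}$ converging to $\theta_0$. The numerator of the right-hand
side of (17), sans the vertical bars, converges to $h'(\theta_0)$ (which is $\ne 0$, since
$C_0 \ne 0$); hence, for all sufficiently large $n$, that expression has the signum of
$h'(\theta_0)$. The functions whose norms appear in the denominator of (17) we know
to converge strongly in $L_r$ to $\pi'_{\theta_0}$ (by the hypothesis of Theorem 5). Hence, for
this case, the function $\psi_0$ of Corollary 7-1 is

$$\psi_0 = \frac{\operatorname{sgn} h'(\theta_0)}{\|\pi'_{\theta_0}\|_r} \, \pi'_{\theta_0}.$$

Therefore, by the same corollary,

$$\phi_0(x) = \frac{|h'(\theta_0)|}{\|\pi'_{\theta_0}\|_r} \cdot \left| \frac{\operatorname{sgn} h'(\theta_0)}{\|\pi'_{\theta_0}\|_r} \, \pi'_{\theta_0}(x) \right|^{r/s} \cdot \operatorname{sgn} h'(\theta_0) \cdot \operatorname{sgn} \pi'_{\theta_0}(x)
= \frac{h'(\theta_0)}{\|\pi'_{\theta_0}\|_r^r} \, |\pi'_{\theta_0}(x)|^{r/s} \operatorname{sgn} \pi'_{\theta_0}(x).$$

And this is the result asserted in the theorem.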

The reader will have no difficulty in establishing, in the exact pattern of the
preceding proof, the following.

Theorem 9. Assume the hypothesis of Theorem 6. And, further, let the equality
sign hold in (9) for $b = b_0$, $c = c_0$.‡ Then,

$$\phi_0(x) = \frac{b_0 h'(\theta_0) + c_0 h''(\theta_0)}{\|b_0 \pi'_{\theta_0} + c_0 \pi''_{\theta_0}\|_r^r} \cdot |b_0 \pi'_{\theta_0}(x) + c_0 \pi''_{\theta_0}(x)|^{r/s} \operatorname{sgn} \left( b_0 \pi'_{\theta_0}(x) + c_0 \pi''_{\theta_0}(x) \right).$$

It is evident that results of the type in these theorems may be built up as
well with integrals over the parameter space.

A question of considerable practical importance is that of the rapidity of
convergence of the $f_n$ to $\phi_0$. An answer to this question, on the level of generality
we are maintaining in this study, consists in relating this convergence to that
of the $|c_n|$ to $C_0$. In the case $s = r = 2$, the answer is immediate and exact:

$$\begin{aligned}
\int_\Omega (f_n - \phi_0)^2 \, d\nu
&= \int_\Omega f_n^2 \, d\nu - 2 \int_\Omega f_n \phi_0 \, d\nu + \int_\Omega \phi_0^2 \, d\nu \\
&= |c_n|^2 - 2 |c_n|^2 + C_0^2 \\
&= C_0^2 - |c_n|^2.
\end{aligned}$$

Thus, if one unbiased estimate is known, it provides, since its norm is $\ge C_0$,
an upper bound for $\|f_n - \phi_0\|_2$. The same is true in the general case (any $s$)
once we have established an upper bound, depending on $C_0$ and $|c_n|$, for
$\|f_n - \phi_0\|_s$. But in the general case, a good upper bound does not seem to be
so close at hand. There are indications of the direction in which one must proceed,
and we hope to draw some significant results out of these before long.



8. The case $s = r = 2$. The particular aspects of this case (where bestness
of an estimate has reference to its variance), which arise out of the coincidence of
$L_r$ and $L_s$, merit some discussion. We shall denote the inner product, $\int_\Omega \xi \eta \, d\nu$, of
two functions $\xi$ and $\eta$ in $L_2$, as usual by $(\xi, \eta)$. Let $\{\mathfrak{P}_0\}$ denote the closed linear
manifold in $L_2$ spanned by the $\pi_\theta$.

Theorem 10. Let $\mathfrak{M}_2$ be non-empty. Then $\phi_0$ is the unique element of $\mathfrak{M}_2$ which
lies in $\{\mathfrak{P}_0\}$.

To begin with, it is clear that the functions $f_n$ of Theorem 7, in the present case
$s = r = 2$, are all elements of $[\mathfrak{P}_0]$, the linear manifold spanned by the $\pi_\theta$.
Hence, since $\phi_0$ is the strong limit of these elements, $\phi_0 \in \{\mathfrak{P}_0\}$.

Now suppose also $\phi_1 \in \mathfrak{M}_2$, $\phi_1 \in \{\mathfrak{P}_0\}$. Then, from

$$(\phi_0, \pi_\theta) = h(\theta), \qquad \theta \in \Theta,$$

$$(\phi_1, \pi_\theta) = h(\theta), \qquad \theta \in \Theta,$$

we have

$$(\phi_1 - \phi_0, \pi_\theta) = 0, \qquad \theta \in \Theta,$$

and, by continuity of the inner product,

$$(\phi_1 - \phi_0, \xi) = 0, \qquad \xi \in \{\mathfrak{P}_0\};$$

that is, $\phi_1 - \phi_0 \in \{\mathfrak{P}_0\}^\perp$. But, from $\phi_0 \in \{\mathfrak{P}_0\}$ and $\phi_1 \in \{\mathfrak{P}_0\}$ it follows that
$\phi_1 - \phi_0 \in \{\mathfrak{P}_0\}$. Hence $\phi_1 - \phi_0 = 0$, and this proves the exclusiveness of the
property for $\phi_0$.

‡ (Footnote to Theorem 9.) In the case $s = 2$, $b_0$ and $c_0$ are the values which render (11) an equality.

Another characterization of $\phi_0$ is given by the following corollary.

Corollary 10-1. If $\mathfrak{M}_2$ is non-empty, then $\phi_0$ is the unique element of $\mathfrak{M}_2$ which
satisfies the system of equations in $\xi$: $(\phi, \xi) = \|\xi\|_2^2$, $\phi \in \mathfrak{M}_2$.

To see that $\phi_0$ has the asserted property, let $\phi$ be any element of $\mathfrak{M}_2$, and set
$\phi = \xi + \eta$, with $\xi \in \{\mathfrak{P}_0\}$ and $\eta \in \{\mathfrak{P}_0\}^\perp$. From

$$(\xi, \pi_\theta) = (\xi + \eta, \pi_\theta) = (\phi, \pi_\theta) = h(\theta),$$

it follows that $\xi \in \mathfrak{M}_2$. Hence $\xi = \phi_0$. And so,

$$(\phi, \phi_0) = (\phi_0 + \eta, \phi_0) = \|\phi_0\|_2^2.$$

If $\phi_1 \in \mathfrak{M}_2$ has this property also, then both

$$(\phi_1, \phi_0) = \|\phi_0\|_2^2$$

and

$$(\phi_0, \phi_1) = \|\phi_1\|_2^2;$$

and therefore

$$\|\phi_1\|_2 = \|\phi_0\|_2.$$

This proves $\phi_1 = \phi_0$, and so the corollary.
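
When $\Theta$ is a finite set, Theorem 10 reduces the determination of $\phi_0$ to a finite linear system: $\phi_0$ must be a linear combination $\sum_j a_j \pi_{\theta_j}$, and the unbiasedness equations $(\phi_0, \pi_{\theta_i}) = h(\theta_i)$ then determine the coefficients. The following Python sketch illustrates this under assumptions that are not in the text: a binomial family on $\{0, 1, \cdots, N\}$, a hypothetical three-point parameter set, $g(\theta) = \theta$, and the function name `best_estimate_finite`.

```python
from math import comb

import numpy as np


def best_estimate_finite(n_trials, thetas, theta0, g):
    """Best (at theta0) unbiased estimate of h = g - g(theta0) over a finite parameter set.

    Binomial(n_trials, theta) is a hypothetical choice of family. The estimate
    is sought in the span of the ratios pi_theta, as Theorem 10 prescribes
    for s = r = 2.
    """
    x = np.arange(n_trials + 1)

    def pmf(t):
        return np.array(
            [comb(n_trials, int(k)) * t**k * (1.0 - t) ** (n_trials - k) for k in x]
        )

    p0 = pmf(theta0)                                   # density of nu
    pi = np.array([pmf(t) / p0 for t in thetas])       # rows are the pi_theta
    h = np.array([g(t) - g(theta0) for t in thetas])
    gram = (pi * p0) @ pi.T                            # inner products (pi_i, pi_j)
    a = np.linalg.solve(gram, h)                       # equations (phi0, pi_i) = h_i
    phi0 = a @ pi
    return phi0, p0, pi, h


phi0, p0, pi, h = best_estimate_finite(4, [0.3, 0.5, 0.7], 0.5, lambda t: t)
print(np.allclose((pi * p0) @ phi0, h))    # unbiasedness on the chosen parameter set
print(float(np.sum(phi0**2 * p0)))         # squared norm of phi0, i.e. its variance at theta0
```

In this sketch the estimate $x/N - \theta_0$ is also unbiased for $h$ on the chosen parameter set, so its variance at $\theta_0$, namely $\theta_0(1 - \theta_0)/N$, cannot be smaller than the squared norm printed above.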

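9. An example. Let $\Omega$ be Euclidean $n$-space, $x = (x_1, x_2, \cdots, x_n)$; $\mu$, Lebesgue
measure; $\Theta$ the set of real numbers; and

$$p_\theta(x) = (2\pi)^{-n/2} \exp \left\{ -\tfrac{1}{2} \sum_{i=1}^{n} (x_i - \theta)^2 \right\}.$$

And finally, let $\theta_0 = 0$. Then

$$\pi_\theta(x) = \exp \left\{ -\tfrac{1}{2} \sum_{i=1}^{n} (-2\theta x_i + \theta^2) \right\}.$$

If $0 < \lambda < \tfrac{1}{2}$, and we define

$$\phi(x) = (1 - 2\lambda)^{n/2} \exp \left\{ \lambda \sum_{i=1}^{n} x_i^2 \right\} - 1,$$

we have, for every $\theta$,

$$\int_\Omega \phi(x) p_\theta(x) \, d\mu = \exp \left\{ \frac{n \lambda}{1 - 2\lambda} \theta^2 \right\} - 1.$$

Thus, $\phi$ is an unbiased estimate of the function $h$:

$$h(\theta) = \exp \left\{ \frac{n \lambda}{1 - 2\lambda} \theta^2 \right\} - 1.$$

If we examine

$$\|\phi\|_s^s = (2\pi)^{-n/2} \int_\Omega \left| (1 - 2\lambda)^{n/2} \exp \left\{ \lambda \sum_{i=1}^{n} x_i^2 \right\} - 1 \right|^s \exp \left\{ -\tfrac{1}{2} \sum_{i=1}^{n} x_i^2 \right\} d\mu,$$

we find that this integral converges only for $s < 1/2\lambda$. Shifting the emphasis,
we may state: for the function $h$, defined by

$$h(\theta) = e^{\alpha \theta^2} - 1, \qquad \alpha > 0,$$

there exists an unbiased estimate with finite $s$th moment at $\theta = 0$, for each

$$s < \frac{n + 2\alpha}{2\alpha}.$$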

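Next, observe that

$$\|\pi_\theta\|_r^r = (2\pi)^{-n/2} \int_\Omega \exp \left\{ -\tfrac{1}{2} \sum_{i=1}^{n} (x_i^2 - 2r\theta x_i + r\theta^2) \right\} d\mu
= \exp \left\{ \tfrac{1}{2} n r (r - 1) \theta^2 \right\},$$

so that the $\pi_\theta$ are elements of $L_r$ for each $r > 1$. The ratio

$$\frac{|h(\theta)|}{\|\pi_\theta\|_r} = \left( e^{\alpha \theta^2} - 1 \right) \exp \left\{ -\tfrac{1}{2} n (r - 1) \theta^2 \right\}$$

is seen to diverge, as $\theta \to \infty$, if

$$\tfrac{1}{2} n (r - 1) < \alpha.$$

Hence, by Theorem 2, there exists no unbiased estimate of $h$ belonging to
$L_s$ for a value of $s$ such that the number

$$r = \frac{s}{s - 1}$$

satisfies the inequality just above; that is, for a value of $s$ greater than

$$\frac{n + 2\alpha}{2\alpha}.$$

Otherwise stated: there exists no unbiased estimate of $h$ with finite $s$th moment at
$\theta = 0$, for

$$s > \frac{n + 2\alpha}{2\alpha}.$$

It is most likely true that this last statement holds, in general, with

$$s \ge \frac{n + 2\alpha}{2\alpha}.$$

We shall consider here only the case

$$\frac{n + 2\alpha}{2\alpha} = 2;$$

and since the analysis is the same for every pair $n, \alpha$ satisfying this equality,
we treat the particular case of

$$n = 1, \qquad \alpha = \tfrac{1}{2}.$$

Thus, we shall show: for $n = 1$, there exists no unbiased estimate of $h_1$,

$$h_1(\theta) = e^{\theta^2/2} - 1,$$

with finite variance at $\theta = 0$.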


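We must show that the ratios

$$\frac{\left| \sum_{i=1}^{m} a_i h_1(\theta_i) \right|}{\left\| \sum_{i=1}^{m} a_i \pi_{\theta_i} \right\|_2}$$

are not bounded for all choices of $m$ (distinct) $\theta_i$'s, and all sets of $m$ real numbers
$a_i$, and all $m$. This is clearly equivalent to showing the same for the ratios

$$Q(m, a_i, \theta_i) = \frac{\left| \sum_{i=1}^{m} a_i \left( 1 - e^{-\theta_i^2/2} \right) \right|}{\left\| \sum_{i=1}^{m} a_i e^{-\theta_i^2/2} \pi_{\theta_i} \right\|_2}.$$

Now we find, by direct computation,

$$\left\| \sum_{i=1}^{m} a_i e^{-\theta_i^2/2} \pi_{\theta_i} \right\|_2^2 = \sum_{i,j=1}^{m} a_i a_j e^{-(\theta_i - \theta_j)^2/2}.$$

And the solution of the familiar extremum problem:

$$\sup_{(a_i)} \left[ \sum_{i=1}^{m} a_i \left( 1 - e^{-\theta_i^2/2} \right) \right]^2 \quad \text{subject to} \quad \sum_{i,j=1}^{m} a_i a_j e^{-(\theta_i - \theta_j)^2/2} = 1$$

yields

$$\sup_{(a_i)} Q^2(m, a_i, \theta_i) = \sum_{i,j=1}^{m} V_{ij} \left( 1 - e^{-\theta_i^2/2} \right) \left( 1 - e^{-\theta_j^2/2} \right),$$

where the matrix

$$V = (V_{ij}), \qquad i, j = 1, 2, \cdots, m,$$

is the inverse of the matrix

$$U = \left( e^{-(\theta_i - \theta_j)^2/2} \right), \qquad i, j = 1, 2, \cdots, m.$$

We now take, in particular,

$$\theta_i = it, \qquad i = 1, 2, \cdots, m,$$

where $t$ is a positive number. Clearly, there exists a number $t_0$ such that for
$t > t_0$,

$$U(t) = \left( e^{-(i - j)^2 t^2/2} \right)$$

is non-singular. Also,

$$\lim_{t \to \infty} U(t) = I,$$

the identity matrix. Then, for $t > t_0$, $V = U^{-1}$ is a continuous function of $U$,
so that

$$\lim_{t \to \infty} V(t) = \left( \lim_{t \to \infty} U(t) \right)^{-1} = I.$$

Hence,

$$\lim_{t \to \infty} V_{ij}(t) = \delta_{ij}.$$

It follows that

$$\lim_{t \to \infty} \sup_{(a_i)} Q^2(m, a_i, it) = m,$$

and therefore,

$$\sup_{(a_i), (\theta_i)} Q^2(m, a_i, \theta_i) \ge m.$$

(A simple argument on the characteristic values of $U$ shows that there is actually
equality here.) This result gives the unboundedness of the ratios $Q$, and our
proposition is proved, by virtue of Theorem 2.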
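
The limiting behaviour used in this argument can be observed numerically. The sketch below is illustrative only: it assumes the kernel matrix with entries $e^{-(\theta_i - \theta_j)^2/2}$ and the weights $1 - e^{-\theta_i^2/2}$ for the case $n = 1$, $\alpha = 1/2$, takes $\theta_i = it$, and evaluates the quadratic form in the inverse matrix that gives the supremum of $Q^2$ over the coefficients. The function name `sup_q_squared` is hypothetical.

```python
import numpy as np


def sup_q_squared(m, t):
    """Supremum over the coefficients a_i of Q^2 for the points theta_i = i * t.

    Assumes U_ij = exp(-(theta_i - theta_j)**2 / 2) and weights
    w_i = 1 - exp(-theta_i**2 / 2); the supremum is w^T U^{-1} w.
    """
    theta = t * np.arange(1, m + 1)
    u = np.exp(-0.5 * (theta[:, None] - theta[None, :]) ** 2)
    w = 1.0 - np.exp(-0.5 * theta**2)
    # Solving U y = w avoids forming the inverse explicitly.
    return float(w @ np.linalg.solve(u, w))


for m in (1, 5, 50):
    print(m, sup_q_squared(m, 6.0))   # close to m once t is large
```

For a large spacing $t$ the value is close to $m$, so no single constant $C$ can serve in condition (5) for every $m$.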


APPENDIX 

The spaces and 8, are instances of a Banach space over the reals; that is, a 
complete, normed, linear vector space, closed under multiplication by real 
numbers, That the space, say 58, is normed is to say that there is a non-negative, 
real-valued function, || ||, defined on S8, with the properties: 

II £ (I = 0 if and only if £ is the null vector, 

II afli = I a| ' lull, 

II f + ’z II ^ II £ II + lU II; 

where £, t; «58 and a is real. The number || £ || is called the norm of £. 



UNBIASED ESTIMATES 


4t>9' 


The function $\|\xi - \eta\|$ on pairs $\xi, \eta$ of vectors is a distance function in the
usual sense. With it, strong convergence (or simply convergence) is defined in $\mathfrak{B}$: $\xi_n$
converges strongly to $\xi$ when $\lim_{n \to \infty} \|\xi_n - \xi\| = 0$. In symbols: $\xi_n \to \xi$ or $\lim \xi_n = \xi$.

The usual set-theoretic notions are now defined in the obvious way; e.g., limit
point of a set, closed set, etc. That the space $\mathfrak{B}$ is complete means that every
sequence $\{\xi_n\}$ satisfying $\lim_{m,n \to \infty} \|\xi_m - \xi_n\| = 0$ converges to a (unique) element
$\xi \in \mathfrak{B}$.

A linear manifold $\mathfrak{M}$ in $\mathfrak{B}$ is a subset of $\mathfrak{B}$ with the property that for any two
elements $\xi, \eta \in \mathfrak{M}$ and any two real numbers $a, b$, we have also $a\xi + b\eta \in \mathfrak{M}$.
A closed linear manifold is a linear manifold that is closed in the set-theoretic
sense. If $S$ is any subset of $\mathfrak{B}$, then the set, $[S]$, of all finite linear combinations of
elements of $S$ is a linear manifold; it is the linear manifold spanned by $S$. The
closure of $[S]$, denoted by $\{S\}$, is called the closed linear manifold spanned by $S$.
In general, $[S]$ is a proper subset of $\{S\}$. A set $S \subset \mathfrak{B}$ is called fundamental
when $\{S\} = \mathfrak{B}$.

A linear functional, $G$, on $\mathfrak{B}$ is a real-valued function with the property
that for any two elements $\xi, \eta \in \mathfrak{B}$ and any two real numbers $a, b$, we have
$G(a\xi + b\eta) = aG(\xi) + bG(\eta)$. The linear functional $G$ is said to be bounded when
the number

$\|G\| = \text{l.u.b.}_{\xi \neq 0} \dfrac{|G(\xi)|}{\|\xi\|}$

is finite. $\|G\|$ is called the norm of $G$. (Throughout the text of the paper, the
qualification "bounded" has been understood in all references to linear functionals.)
If we define the sum of two linear functionals $F$ and $G$ by
$(F + G)(\xi) = F(\xi) + G(\xi)$, and make the other requisite definitions in the obvious way,
we find that the bounded linear functionals on $\mathfrak{B}$ form a linear vector space
over the reals. The function $\|\ \|$ on the bounded linear functionals, which we
have already called a norm, is in fact a norm in the Banach space sense. This
vector space, so normed, is readily shown to be complete. Hence it is a Banach
space, usually called the conjugate space to $\mathfrak{B}$. It is this space we have referred
to in the text as the space of linear functionals on $\mathfrak{B}$.

If a sequence $\{\xi_n\}$ of elements of $\mathfrak{B}$ has the property that $\lim_{n \to \infty} G(\xi_n) = G(\xi)$
for every bounded linear functional $G$, then $\{\xi_n\}$ is said to converge weakly to $\xi$.
If, of the sequence $\{\xi_n\}$, we know only that $\lim_{n \to \infty} G(\xi_n)$ exists for every bounded
linear functional, we say simply that the sequence is weakly convergent. The
space $\mathfrak{B}$ is called weakly complete if every weakly convergent sequence converges
weakly to a limit. The spaces $\mathfrak{L}_r$, $r \geq 1$, are weakly complete. $\mathfrak{B}$ is said to be
weakly compact if every bounded set $S \subset \mathfrak{B}$ contains a weakly convergent
sequence. That $S$ is "bounded" means $\text{l.u.b.}_{\xi \in S} \|\xi\| < \infty$.

A real Hilbert space $\mathfrak{H}$ is a real Banach space on which there is defined an
inner product; that is, a function $(\xi, \eta)$ on pairs of elements $\xi, \eta$, with the
properties

$(\xi, \eta) = (\eta, \xi)$,

$(a\xi, \eta) = a(\xi, \eta)$,

$(\xi_1 + \xi_2, \eta) = (\xi_1, \eta) + (\xi_2, \eta)$.

The inner product is a continuous function of both its arguments; i.e., $\lim \xi_n = \xi$
and $\lim \eta_n = \eta$ imply $\lim (\xi_n, \eta_n) = (\xi, \eta)$. The space $\mathfrak{L}_2$ in the text is a Hilbert
space when we take $(\xi, \eta) = \int \xi \eta \, d\nu$. Two elements $\xi, \eta$ which are such that
$(\xi, \eta) = 0$ are said to be orthogonal. If $S$ is any set in $\mathfrak{H}$ then the set of elements
of $\mathfrak{H}$ each of which is orthogonal to every element of $S$ is called the orthocomplement
of $S$, and is denoted by

For further elaboration the reader is referred to [13] and [19].

REFERENCES

[1] D. Blackwell and M. A. Girshick, "A lower bound for the variance of some unbiased
sequential estimates," Annals of Math. Stat., Vol. 18 (1947), pp. 277-280.

[2] R. A. Fisher, "Theory of statistical estimation," Camb. Phil. Soc. Proc., Vol. 22
(1925), pp. 700-725.

[3] H. Cramér, Mathematical Methods of Statistics, Princeton Press, Princeton, 1946.

[4] A. Bhattacharyya, "On some analogues of the amount of information and their use
in statistical estimation," Sankhyā, Vol. 8 (1946), pp. 1-14.

[5] M. A. Girshick, F. Mosteller, and L. J. Savage, "Unbiased estimates for certain
binomial sampling problems," Annals of Math. Stat., Vol. 17 (1946), pp. 13-23.

[6] J. Neyman, "Su un teorema concernente le cosidette statistiche sufficienti," Giornale
dell'Istituto Italiano degli Attuari, Vol. 6 (1935), pp. 320-334.

[7] D. Blackwell, "Conditional expectation and unbiased sequential estimation,"
Annals of Math. Stat., Vol. 18 (1947), pp. 105-110.

[8] H. Cramér, "A contribution to the theory of statistical estimation," Skand. Aktuar.
tids., Vol. 29 (1946), pp. 85-94.

[9] A. Bhattacharyya, "On some analogues of the amount of information and their use
in statistical estimation (cont'd)," Sankhyā, Vol. 8 (1947), pp. 201-218.

[10] J. Wolfowitz, "The efficiency of sequential estimates and Wald's equation for
sequential processes," Annals of Math. Stat., Vol. 18 (1947), pp. 215-230.

[11] F. Riesz, Les Systèmes d'Équations Linéaires à une Infinité d'Inconnues, Gauthier-
Villars, Paris, 1913.

[12] F. Riesz, "Untersuchungen über Systeme integrierbarer Funktionen," Math. Annalen,
Vol. 69 (1910), pp. 449-497.

[13] S. Banach, Théorie des Opérations Linéaires, Garasinski, Warsaw, 1932.

[14] N. Dunford, "Uniformity in linear spaces," Am. Math. Soc. Trans., Vol. 44 (1938),
pp. 305-356.

[15] M. H. Steinhaus, "Additive und stetige Funktionaloperationen," Math. Zeits., Vol.
5 (1919), pp. 186-221.





[16] F. Riesz, "Sur une espèce de géométrie analytique des systèmes de fonctions
sommables," Comptes Rendus, Vol. 144 (1907), pp. 1409-1411.

[17] M. Fréchet, "Sur les ensembles de fonctions et les opérations linéaires," Comptes
Rendus, Vol. 144 (1907), pp. 1414-1416.

[18] S. Saks, Theory of the Integral, Stechert, New York, 1937.

[19] J. von Neumann, Functional Operators (mimeographed notes), Princeton, 1935.

[20] G. R. Seth, "On the variance of estimates," Annals of Math. Stat., Vol. 20 (1949),
pp. 1-27.

[21] B. J. Pettis, "A note on regular Banach spaces," Am. Math. Soc. Bull., Vol. 44 (1938),
pp. 420-428.



A SEQUENTIAL DECISION PROCEDURE FOR CHOOSING ONE OF
THREE HYPOTHESES CONCERNING THE UNKNOWN
MEAN OF A NORMAL DISTRIBUTION

By Milton Sobel and Abraham Wald¹

Columbia University

1. Introduction. In this paper a multi-decision problem is investigated from
a sequential viewpoint and compared with the best non-sequential procedure
available. Multi-decision problems occur often in practice but methods to deal
with such problems are not yet sufficiently developed.

The problem under consideration here is a 3-decision problem: Given a chance
variable which is normally distributed with known variance $\sigma^2$, but unknown
mean $\theta$, and given two real numbers $a_1 < a_2$, the problem is to choose one of the
three mutually exclusive and exhaustive hypotheses

$H_1: \theta < a_1 \qquad H_2: a_1 \leq \theta \leq a_2 \qquad H_3: \theta > a_2$.

In order to select a proper sequential decision procedure, the parameter space
is subdivided into 5 mutually exclusive and exhaustive zones in the following
manner. Around $a_1$ there exists an interval $(\theta_1, \theta_2)$ in which we have no strong
preference between $H_1$ and $H_2$ but prefer (strongly) to reject $H_3$. Around $a_2$
there exists an interval $(\theta_3, \theta_4)$ in which we have no strong preference between
$H_2$ and $H_3$ but prefer (strongly) to reject $H_1$. For $\theta \leq \theta_1$ we prefer to accept $H_1$.
For $\theta_2 \leq \theta \leq \theta_3$ we prefer to accept $H_2$. For $\theta \geq \theta_4$ we prefer to accept $H_3$.

The intervals $(\theta_1, \theta_2)$ and $(\theta_3, \theta_4)$ will be called indifference zones. The
determination of these indifference zones is not a statistical problem but should
be made on practical considerations concerning the consequences of a wrong
decision.

In accordance with the above we define a wrong decision in the following
way. For $\theta \leq \theta_1$, acceptance of $H_2$ or $H_3$ is wrong. For $\theta_1 < \theta < \theta_2$, acceptance of
$H_3$ is wrong. For $\theta_2 \leq \theta \leq \theta_3$, acceptance of $H_1$ or $H_3$ is wrong. For $\theta_3 < \theta < \theta_4$,
acceptance of $H_1$ is wrong. For $\theta \geq \theta_4$, acceptance of $H_1$ or $H_2$ is wrong.

The requirements on our decision procedure necessary to limit the probability
of a wrong decision are investigated. Two cases are considered.

Case 1: Prob. of a wrong decision $\leq \gamma$ for all $\theta$.

Case 2: Prob. of a wrong decision $\leq \gamma_1$ for $\theta \leq \theta_1$,
        Prob. of a wrong decision $\leq \gamma_2$ for $\theta_1 < \theta < \theta_4$,
        Prob. of a wrong decision $\leq \gamma_3$ for $\theta \geq \theta_4$.

The decision procedure discussed in the present paper is not an optimum 
procedure since, as will be seen later, the final decision at the termination of 

¹ Work done under the sponsorship of the Office of Naval Research.



experimentation is not in every case a function of only "the sample mean of all
the observations", although the sample mean is a sufficient statistic for $\theta$.
Although the procedure considered is not optimal it is suggested for the following
reasons:

1. The decision procedure can be carried out simply. In fact tables can be
constructed before experimentation starts that render the procedure completely
mechanical.

2. The derivation of the operating characteristic (OC) function, neglecting the
excess of the cumulative sum over the boundary, is accomplished with little
difficulty. In general, for other multi-decision problems it is unknown how to
obtain the OC function.

3. It is believed that the loss of efficiency is not serious, i.e., the suggested
sequential procedure is not far from being optimum. In this connection a
non-sequential procedure is compared with this sequential procedure. The results
show that, for the same maximum probability of making a wrong decision, the
sequential procedure requires on the average substantially fewer observations to
reach a final decision. In fact, for Case 1 noted above, if $.008 < \gamma < .1$, and if
certain symmetrical features are assumed, then the fixed number of observations
required by the non-sequential method is greater than the maximum of the
average sample number (ASN) function taken over all values of $\theta$.

It was found necessary in the course of the investigation to put an upper bound
on the quantity $\dfrac{\theta_2 - \theta_1}{a_2 - a_1}$ in order that the methods used to obtain upper and lower
bounds for the ASN function should give close results. This restriction, however,
is likely to be satisfied in practical applications.

All formulas for ASN and OC functions which will be used in this paper will be
approximation formulas neglecting the excess of the cumulative sum over the
boundaries. Nevertheless, equality signs will be used in these formulas, except
when additional approximations are involved.

2. Description of the Decision Procedure.² We shall assume that the
indifference zones described above have the following properties:

(i) $\theta_1 < a_1 < \theta_2 \leq \theta_3 < a_2 < \theta_4$

(ii) $\theta_1 + \theta_2 = 2a_1$; $\theta_3 + \theta_4 = 2a_2$

(iii) $\theta_2 - \theta_1 = \theta_4 - \theta_3 = \Delta$ (say).

² A similar decision procedure was used by P. Armitage [2] as an alternative to the
sequential t test (with 2-sided alternatives). The form used there is more restricted as he
considers only the case $\theta_2 = \theta_3$. Essential inequalities on the OC function are pointed out
but no attempt is made to determine the complete OC and ASN functions. A closely related
but somewhat different procedure for dealing with a trichotomy was suggested by Milton
Friedman while he was a member of the Statistical Research Group of Columbia University.
As far as the authors are aware, no results were obtained concerning the OC and ASN
functions of Friedman's procedure.





Let $R_1$ denote the Sequential Probability Ratio Test for testing the hypothesis
that $\theta = \theta_1$ against the hypothesis that $\theta = \theta_2$. We assume for the present that
either the proper constants $A$, $B$ in the probability ratio test are given or that
they are approximated from given $\alpha$, $\beta$ by the relations

$A \approx \dfrac{1 - \beta}{\alpha}, \qquad B \approx \dfrac{\beta}{1 - \alpha}$.

Here $\alpha$ and $\beta$ are upper bounds on the probabilities of first and second types of
errors, respectively.

Let $R_2$ represent the S.P.R.T. for testing the hypothesis that $\theta = \theta_3$ against the
alternative that $\theta = \theta_4$. For this test we assume that $(\alpha, \beta, A, B)$ are replaced
by $(\hat\alpha, \hat\beta, \hat A, \hat B)$ and as above that either $\hat A$ and $\hat B$ are given or that they are
approximated from given $\hat\alpha$, $\hat\beta$.

The decision procedure is carried out as follows:

Both $R_1$ and $R_2$ are computed at each stage of the inspection until

Either: One ratio leads to a decision to stop before the other. Then the former
is no longer computed and the latter is continued until it leads to a decision to
stop.

Or: Both $R_1$ and $R_2$ lead to a decision to stop at the same stage. In this event
both computations are discontinued.

The following table gives the rule $R$ for the decisions to be made corresponding
to all possible outcomes of $R_1$ and $R_2$.


        $R_1$                        $R_2$                         $R$

If   accepts $\theta_1$   and   accepts $\theta_3$   then   accepts $H_1$
If   accepts $\theta_2$   and   accepts $\theta_3$   then   accepts $H_2$
If   accepts $\theta_2$   and   accepts $\theta_4$   then   accepts $H_3$
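The rule $R$ described above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `sprt_limits` and `run_rule_R`, the argument names, and the convention of returning `None` when the supplied observations run out are assumptions made here, not part of the paper. The limits are the acceptance and rejection numbers $(\sigma^2/\Delta)\log B + a n$ and $(\sigma^2/\Delta)\log A + a n$ for the cumulative sum.

```python
import math


def sprt_limits(sigma, delta, a, A, B, n):
    """Acceptance and rejection numbers for the cumulative sum at stage n."""
    scale = sigma ** 2 / delta
    return scale * math.log(B) + a * n, scale * math.log(A) + a * n


def run_rule_R(xs, sigma, a1, a2, delta, A, B, A_hat=None, B_hat=None):
    """Apply the combined rule R to the observations xs.

    Returns (decision, n) with decision in {1, 2, 3} for H1, H2, H3,
    or (None, len(xs)) if the data run out before both tests stop.
    A_hat and B_hat default to A and B (hypothetical convenience).
    """
    A_hat = A if A_hat is None else A_hat
    B_hat = B if B_hat is None else B_hat
    r1 = None  # outcome of R1: "theta1" or "theta2"
    r2 = None  # outcome of R2: "theta3" or "theta4"
    total = 0.0
    for n, x in enumerate(xs, start=1):
        total += x
        if r1 is None:  # R1 is still being computed
            low, high = sprt_limits(sigma, delta, a1, A, B, n)
            if total <= low:
                r1 = "theta1"
            elif total >= high:
                r1 = "theta2"
        if r2 is None:  # R2 is still being computed
            low, high = sprt_limits(sigma, delta, a2, A_hat, B_hat, n)
            if total <= low:
                r2 = "theta3"
            elif total >= high:
                r2 = "theta4"
        if r1 is not None and r2 is not None:
            if r1 == "theta1" and r2 == "theta3":
                return 1, n
            if r1 == "theta2" and r2 == "theta3":
                return 2, n
            if r1 == "theta2" and r2 == "theta4":
                return 3, n
            raise ValueError("theta1 and theta4 were both accepted")
    return None, len(xs)


# Deterministic illustration with sigma = 1, a1 = -1/4, a2 = 1/4, Delta = 1/8.
print(run_rule_R([0.0] * 200, 1.0, -0.25, 0.25, 0.125, 33.5, 1 / 33.5))
```

With constant observations equal to zero, both tests stop at the same stage and $H_2$ is accepted; replacing the zeros by a constant far below $a_1$ or far above $a_2$ leads to $H_1$ or $H_3$ respectively.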


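We shall show that acceptance of both $\theta_1$ and $\theta_4$ is impossible when $(\hat A, \hat B) =
(A, B)$. For this purpose we need the acceptance number and rejection number
formulas. (See page 119 of [1]).

         Acceptance Number                                   Rejection Number

$R_1$:  $\dfrac{\sigma^2}{\Delta} \log B + a_1 n < \sum_{\alpha=1}^{n} x_\alpha < \dfrac{\sigma^2}{\Delta} \log A + a_1 n$

$R_2$:  $\dfrac{\sigma^2}{\Delta} \log \hat B + a_2 n < \sum_{\alpha=1}^{n} x_\alpha < \dfrac{\sigma^2}{\Delta} \log \hat A + a_2 n$.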

We shall assume for convenience that "between observations" $R_1$ is tested before
$R_2$ and let the term "initial decision" refer to the first decision made.

Assume $\theta_1$ and $\theta_4$ are both accepted. Then if $\theta_1$ is accepted initially at the mth
stage

$\sum_{\alpha=1}^{m} x_\alpha \leq \dfrac{\sigma^2}{\Delta} \log B + a_1 m$.

Since

$\dfrac{\sigma^2}{\Delta} \log B + a_1 m < \dfrac{\sigma^2}{\Delta} \log B + a_2 m$

it follows that $\theta_4$ is rejected at the same stage, contradicting the hypothesis.
Similarly if $\theta_4$ is accepted initially at the mth stage, then

$\sum_{\alpha=1}^{m} x_\alpha \geq \dfrac{\sigma^2}{\Delta} \log A + a_2 m$.

Since

$\dfrac{\sigma^2}{\Delta} \log A + a_2 m > \dfrac{\sigma^2}{\Delta} \log A + a_1 m$

it follows that $\theta_1$ is rejected at the same or at an earlier stage, contradicting the
assumption that the acceptance of $\theta_4$ is an initial decision. Hence $\theta_1$ and $\theta_4$ cannot
both be accepted.

A geometrical representation of the rule R is given in Figure 1. 

$R$ can now be described as follows: Continue taking observations until an
acceptance region (shaded area) is reached or both dashed lines are crossed. In
the former case, stop and accept as shown above. In the latter case stop and
accept $H_2$.

The proof above that $\theta_1$ and $\theta_4$ cannot both be accepted consists of noting that
a point below the acceptance line for $\theta_1$ is already below the rejection line for
$\theta_4$ and that a point above the acceptance line for $\theta_4$ is already above the rejection
line for $\theta_1$.

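If $(\hat A, \hat B) \neq (A, B)$, a necessary and sufficient condition for the impossibility
of accepting $\theta_1$ and $\theta_4$ is that at $n = 1$ the following inequalities should hold.

Rejection Number (of $\theta_1$) for $R_1$ $\leq$ Rejection Number (of $\theta_3$) for $R_2$

and

Acceptance Number (of $\theta_1$) for $R_1$ $\leq$ Acceptance Number (of $\theta_3$) for $R_2$.

In symbols

$\dfrac{\sigma^2}{\Delta} \log A + a_1 \leq \dfrac{\sigma^2}{\Delta} \log \hat A + a_2$

and

$\dfrac{\sigma^2}{\Delta} \log B + a_1 \leq \dfrac{\sigma^2}{\Delta} \log \hat B + a_2$.

These can be written as

$\dfrac{A}{\hat A} \leq e^{d\Delta/\sigma^2}$ and $\dfrac{B}{\hat B} \leq e^{d\Delta/\sigma^2}$

respectively, where $d = a_2 - a_1$.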





Since $\dfrac{d\Delta}{\sigma^2} > 0$, the above inequalities are certainly fulfilled when

(2.1) $\dfrac{B}{\hat B} \leq 1$ and $\dfrac{A}{\hat A} \leq 1$.

In what follows in this paper, we shall restrict ourselves to cases where
acceptance of both $\theta_1$ and $\theta_4$ is impossible, even if this is not stated explicitly.



FIGURE 1

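3. Derivation of OC Functions. Let $L(H_i \mid \theta, R)$ denote the probability of
accepting $H_i$ when $\theta$ is the true mean and $R$ is the sequential rule used. Let $H_{\theta_i}$
denote the hypothesis that $\theta = \theta_i$. Since, as shown above, $H_1$ is accepted if and
only if $\theta_1$ is accepted, we have

(3.1) $L(H_1 \mid \theta, R) = L(H_{\theta_1} \mid \theta, R_1)$.

Similarly,

(3.2) $L(H_3 \mid \theta, R) = L(H_{\theta_4} \mid \theta, R_2)$.

From the fact that $R_1$ and $R_2$ each terminate at some finite stage with
probability one, it follows that $R$ will terminate at some finite stage with probability
one. Hence

(3.3) $L(H_2 \mid \theta, R) = 1 - L(H_1 \mid \theta, R) - L(H_3 \mid \theta, R)$.

From pp. 50-52 of [1], the following equations are obtained.

(3.4) $L(H_1 \mid \theta, R) = L(H_{\theta_1} \mid \theta, R_1) = \dfrac{A^{h_1} - 1}{A^{h_1} - B^{h_1}}$

where

$h_1 = h_1(\theta) = \dfrac{\theta_2 + \theta_1 - 2\theta}{\theta_2 - \theta_1} = \dfrac{a_1 - \theta}{\Delta/2}$

and

(3.5) $L(H_{\theta_3} \mid \theta, R_2) = \dfrac{\hat A^{h_2} - 1}{\hat A^{h_2} - \hat B^{h_2}}$

where

$h_2 = h_2(\theta) = \dfrac{\theta_4 + \theta_3 - 2\theta}{\theta_4 - \theta_3} = \dfrac{a_2 - \theta}{\Delta/2}$.

These equations involve an approximation, as explained in [1].
Hence

(3.6) $L(H_3 \mid \theta, R) = L(H_{\theta_4} \mid \theta, R_2) = 1 - L(H_{\theta_3} \mid \theta, R_2) = \dfrac{1 - \hat B^{h_2}}{\hat A^{h_2} - \hat B^{h_2}}$

and

(3.7) $L(H_2 \mid \theta, R) = 1 - \dfrac{A^{h_1} - 1}{A^{h_1} - B^{h_1}} - \dfrac{1 - \hat B^{h_2}}{\hat A^{h_2} - \hat B^{h_2}} = \dfrac{1 - B^{h_1}}{A^{h_1} - B^{h_1}} - \dfrac{1 - \hat B^{h_2}}{\hat A^{h_2} - \hat B^{h_2}}$.

Since $L(H_1 \mid \theta, R) = L(H_{\theta_1} \mid \theta, R_1)$, it follows that $L(H_1 \mid \theta, R)$ is a
monotonically decreasing function of $\theta$ and that

$L(H_1 \mid -\infty, R) = 1$; $L(H_1 \mid \infty, R) = 0$

$L(H_1 \mid \theta_1, R) = 1 - \alpha$; $L(H_1 \mid \theta_2, R) = \beta$

$L(H_1 \mid a_1, R) = \dfrac{\log A}{\log A + |\log B|}$.

Similarly, since $L(H_3 \mid \theta, R) = 1 - L(H_{\theta_3} \mid \theta, R_2)$, it follows that $L(H_3 \mid \theta, R)$
is a monotonically increasing function of $\theta$ and that

$L(H_3 \mid -\infty, R) = 0$; $L(H_3 \mid \infty, R) = 1$.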





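Since $L(H_2 \mid \theta, R) = 1 - L(H_1 \mid \theta, R) - L(H_3 \mid \theta, R)$ it follows easily from the
above results that

$L(H_2 \mid -\infty, R) = 0$; $L(H_2 \mid \infty, R) = 0$

$L(H_2 \mid \theta, R) < \alpha$ for $\theta < \theta_1$; $L(H_2 \mid \theta, R) < \hat\beta$ for $\theta > \theta_4$

$\dfrac{|\log B|}{\log A + |\log B|} - \hat\alpha < L(H_2 \mid a_1, R) < \dfrac{|\log B|}{\log A + |\log B|}$

$\dfrac{\log \hat A}{\log \hat A + |\log \hat B|} - \beta < L(H_2 \mid a_2, R) < \dfrac{\log \hat A}{\log \hat A + |\log \hat B|}$

$1 - \beta - \hat\alpha < L(H_2 \mid \theta, R) < 1$ for $\theta_2 \leq \theta \leq \theta_3$.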
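The OC functions of Section 3 can be evaluated numerically. The sketch below is illustrative only: the function names `wald_oc` and `oc_functions` are assumptions made here, and the formulas are the approximations $(A^{h} - 1)/(A^{h} - B^{h})$ with $h_1 = (a_1 - \theta)/(\Delta/2)$ and $h_2 = (a_2 - \theta)/(\Delta/2)$, which neglect the excess over the boundaries. It assumes moderate values of $h$ (very large $|h|$ would overflow floating-point powers).

```python
import math


def wald_oc(h, A, B):
    """Approximate OC value (A^h - 1) / (A^h - B^h), with its limit at h = 0."""
    if abs(h) < 1e-12:
        return math.log(A) / (math.log(A) - math.log(B))
    return (A ** h - 1.0) / (A ** h - B ** h)


def oc_functions(theta, a1, a2, delta, A, B, A_hat, B_hat):
    """Return (L(H1), L(H2), L(H3)) at the true mean theta under the rule R."""
    h1 = (a1 - theta) / (delta / 2)
    h2 = (a2 - theta) / (delta / 2)
    L1 = wald_oc(h1, A, B)                   # probability that R1 accepts theta_1
    L3 = 1.0 - wald_oc(h2, A_hat, B_hat)     # probability that R2 accepts theta_4
    return L1, 1.0 - L1 - L3, L3


# Illustration: alpha = beta = .05, a1 = -1/4, a2 = 1/4, Delta = 1/8.
A, B = 0.95 / 0.05, 0.05 / 0.95
for theta in (-0.3125, -0.25, 0.0, 0.25, 0.3125):
    print(theta, oc_functions(theta, -0.25, 0.25, 0.125, A, B, A, B))
```

At $\theta = \theta_1$ the first component equals $1 - \alpha$, and at $\theta = a_1$ it equals $\log A / (\log A + |\log B|)$, in agreement with the values listed above.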


4. Probability of Correct Decision. Denote the probability of a correct
decision by $L(\theta \mid R)$. It is defined as follows:

Interval                              Correct Decisions               $L(\theta \mid R)$

$\theta \leq \theta_1$                acceptance of $H_1$             $L(H_1 \mid \theta, R)$
$\theta_1 < \theta < \theta_2$        acceptance of $H_1$ or $H_2$    $L(H_1 \mid \theta, R) + L(H_2 \mid \theta, R)$
$\theta_2 \leq \theta \leq \theta_3$  acceptance of $H_2$             $L(H_2 \mid \theta, R)$
$\theta_3 < \theta < \theta_4$        acceptance of $H_2$ or $H_3$    $L(H_2 \mid \theta, R) + L(H_3 \mid \theta, R)$
$\theta_4 \leq \theta$                acceptance of $H_3$             $L(H_3 \mid \theta, R)$

It should be noted that at points of discontinuity, $L(\theta \mid R)$ is defined as the
smaller of the two limiting values.

We shall now discuss some monotonicity properties of the function $L(\theta \mid R)$.
From the fact that $L(H_{\theta_1} \mid \theta, R_1)$ and $L(H_{\theta_3} \mid \theta, R_2)$ are continuous with
continuous first and second derivatives and are monotonically decreasing for all
$\theta$ with a single point of inflection in the intervals $\theta_1 < \theta < \theta_2$ and $\theta_3 < \theta < \theta_4$
respectively, it follows that

(i) $L(\theta \mid R)$ is monotonically decreasing with negative curvature for $-\infty <
\theta \leq \theta_1$.

(ii) $L(\theta \mid R)$ is monotonically increasing with negative curvature for $\theta_4 \leq
\theta < \infty$.

Making use of (3.3) we have further

(iii) $L(\theta \mid R)$ is monotonically decreasing with negative curvature for $\theta_1 <
\theta < \theta_2$.

(iv) $L(\theta \mid R)$ is monotonically increasing with negative curvature for $\theta_3 <
\theta < \theta_4$.

(v) For $\theta_2 \leq \theta \leq \theta_3$, $\dfrac{d}{d\theta} L(\theta \mid R) = -\left[\dfrac{d}{d\theta} L(H_1 \mid \theta, R) + \dfrac{d}{d\theta} L(H_3 \mid \theta, R)\right]$ is
decreasing, since $\dfrac{d}{d\theta} L(H_1 \mid \theta, R)$ and $\dfrac{d}{d\theta} L(H_3 \mid \theta, R)$ are increasing. In
other words $L(\theta \mid R)$ has negative curvature for $\theta_2 \leq \theta \leq \theta_3$.





In the special case when $A = \hat A = \dfrac{1}{B} = \dfrac{1}{\hat B}$ and the origin is taken at $\dfrac{a_1 + a_2}{2}$
for the sake of convenience, it is easy to see that $L(\theta \mid R)$ is symmetric with
respect to the origin and, because of (v), has a local maximum at $\theta = 0$.

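5. Choice of the constants $A$, $B$, $\hat A$, $\hat B$ to insure prescribed Lower Bounds
for $L(\theta \mid R)$. We shall deal here with the question of choosing $A$, $B$, $\hat A$ and $\hat B$
such that $L(\theta \mid R) \geq 1 - \gamma_1$ when $\theta \leq \theta_1$, $L(\theta \mid R) \geq 1 - \gamma_2$ when $\theta_1 < \theta < \theta_4$,
and $L(\theta \mid R) \geq 1 - \gamma_3$ when $\theta \geq \theta_4$. From the monotonic properties of the correct
decision function it is only necessary to insure that

(5.1) $L(\theta_1 \mid R) = 1 - \gamma_1$, $L(\theta_2 \mid R) = L(\theta_3 \mid R) = 1 - \gamma_2$ and $L(\theta_4 \mid R) = 1 - \gamma_3$.

The following relations will be needed:

$h_1(\theta_1) = h_2(\theta_3) = 1 = -h_1(\theta_2) = -h_2(\theta_4)$

$h_2(\theta_2) = \dfrac{\theta_3 + \theta_4 - 2\theta_2}{\Delta} = \dfrac{a_2 - \theta_2}{\Delta/2} = r$ (say)

$h_1(\theta_3) = \dfrac{\theta_1 + \theta_2 - 2\theta_3}{\Delta} = -r$

where $d = \theta_4 - \theta_2 = \theta_3 - \theta_1 = a_2 - a_1$, so that $r = \dfrac{2d}{\Delta} - 1$.

The following four equations are obtained from (5.1):

(5.2) $1 - L(H_1 \mid \theta_1, R) = L(H_{\theta_2} \mid \theta_1, R_1) = \dfrac{1 - B}{A - B} = \gamma_1$

(5.3) $1 - L(H_2 \mid \theta_2, R) = L(H_1 \mid \theta_2, R) + L(H_3 \mid \theta_2, R) = \dfrac{B(A - 1)}{A - B} + \left[\dfrac{1 - \hat B^r}{\hat A^r - \hat B^r}\right] = \gamma_2$

(5.4) $1 - L(H_2 \mid \theta_3, R) = L(H_3 \mid \theta_3, R) + L(H_1 \mid \theta_3, R) = \dfrac{1 - \hat B}{\hat A - \hat B} + \left[\dfrac{B^r(A^r - 1)}{A^r - B^r}\right] = \gamma_2$

(5.5) $1 - L(H_3 \mid \theta_4, R) = L(H_{\theta_3} \mid \theta_4, R_2) = \dfrac{\hat B(\hat A - 1)}{\hat A - \hat B} = \gamma_3$.

The "bracketed terms" represent quantities less than $\hat\alpha$ and $\beta$ respectively and
if $r$ is sufficiently large they can be neglected. This will be made more precise
but first let us note the results of neglecting the bracketed terms.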
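From (5.2) and (5.3) we obtain

(5.6) $B(1 - \gamma_1) = \gamma_2$, whence $B = \dfrac{\gamma_2}{1 - \gamma_1}$.

From (5.2) and (5.6)

(5.7) $\gamma_1 A = 1 - \gamma_2$, whence $A = \dfrac{1 - \gamma_2}{\gamma_1}$.

Since the last two equations are obtained from the first two by the permutation
$A \to \dfrac{1}{\hat B}$, $B \to \dfrac{1}{\hat A}$, $\gamma_1 \to \gamma_3$, $\gamma_3 \to \gamma_1$, we have

$\hat B = \dfrac{\gamma_3}{1 - \gamma_2}, \qquad \hat A = \dfrac{1 - \gamma_3}{\gamma_2}$.

If $\gamma_1 = \gamma_2 = \gamma_3 = \gamma$ (say) then $A = \hat A = \dfrac{1}{B} = \dfrac{1}{\hat B} = \dfrac{1 - \gamma}{\gamma}$.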
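The approximate constants obtained by neglecting the bracketed terms can be computed directly from $\gamma_1$, $\gamma_2$, $\gamma_3$. The following sketch is illustrative; the function name `approximate_constants` and the argument names `g1`, `g2`, `g3` are assumptions made here. It also checks numerically that the computed $A$, $B$ satisfy the unbracketed parts of (5.2) and (5.3).

```python
def approximate_constants(g1, g2, g3):
    """Constants (A, B, A_hat, B_hat) from (5.6), (5.7) and their counterparts.

    g1, g2, g3 stand for gamma_1, gamma_2, gamma_3; the bracketed terms of
    (5.3) and (5.4) are neglected, as in the text.
    """
    A = (1.0 - g2) / g1
    B = g2 / (1.0 - g1)
    A_hat = (1.0 - g3) / g2
    B_hat = g3 / (1.0 - g2)
    return A, B, A_hat, B_hat


A, B, A_hat, B_hat = approximate_constants(0.02, 0.05, 0.03)
# Unbracketed parts of (5.2) and (5.3): these reproduce gamma_1 and gamma_2.
print((1 - B) / (A - B), B * (A - 1) / (A - B))
# Unbracketed parts of (5.4) and (5.5): these reproduce gamma_2 and gamma_3.
print((1 - B_hat) / (A_hat - B_hat), B_hat * (A_hat - 1) / (A_hat - B_hat))
```

When the three $\gamma$'s are equal the four constants collapse to the single value $(1 - \gamma)/\gamma$ and its reciprocal.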

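We shall consider the bracketed quantities negligible if the result of neglecting
them produces a change of less than 20% in $[1 - L(\theta \mid R)]$ at $\theta = \theta_2, \theta_3$
respectively, i.e., if

(5.8) $\dfrac{1 - \hat B^r}{\hat A^r - \hat B^r} = \dfrac{1 - \left(\dfrac{\gamma_3}{1 - \gamma_2}\right)^r}{\left(\dfrac{1 - \gamma_3}{\gamma_2}\right)^r - \left(\dfrac{\gamma_3}{1 - \gamma_2}\right)^r} < \dfrac{\gamma_2}{5}$

and

(5.9) $\dfrac{B^r(A^r - 1)}{A^r - B^r} = \dfrac{\left(\dfrac{\gamma_2}{1 - \gamma_1}\right)^r \left[\left(\dfrac{1 - \gamma_2}{\gamma_1}\right)^r - 1\right]}{\left(\dfrac{1 - \gamma_2}{\gamma_1}\right)^r - \left(\dfrac{\gamma_2}{1 - \gamma_1}\right)^r} < \dfrac{\gamma_2}{5}$.

Inequality (5.9) can be written as

$\dfrac{\gamma_2^r \left[(1 - \gamma_2)^r - \gamma_1^r\right]}{(1 - \gamma_2)^r (1 - \gamma_1)^r - (\gamma_1 \gamma_2)^r} < \dfrac{\gamma_2}{5}$.

This will certainly hold if

$\left(\dfrac{\gamma_2}{1 - \gamma_1}\right)^r < \dfrac{\gamma_2}{5}$.

Assume that $\gamma_1$, $\gamma_2$ and $\gamma_3$ are each less than $\tfrac{1}{2}$. Then the last inequality can be
written as

(5.10) $r \geq \dfrac{\log \dfrac{5}{\gamma_2}}{\log \dfrac{1 - \gamma_1}{\gamma_2}}$.

Starting with (5.8) the same relation is obtained except that $\gamma_1$ is replaced by
$\gamma_3$, namely

(5.11) $r \geq \dfrac{\log \dfrac{5}{\gamma_2}}{\log \dfrac{1 - \gamma_3}{\gamma_2}}$.

Let

$k = \dfrac{\log \dfrac{5}{\gamma_2}}{\log \dfrac{1 - \bar\gamma}{\gamma_2}}$

where $\bar\gamma$ is the larger of $\gamma_1$ and $\gamma_3$. Then $k$ is the larger of the right hand members
of (5.10) and (5.11). Then for (5.8) and (5.9) to hold it is sufficient that

$r \geq k$.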


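If $\gamma_2 = .05$ and $0 < \gamma_1, \gamma_3 < .1$ then $k$ is approximately $\dfrac{\log 100}{\log 20} = 1.54$. If $\gamma_2 = .01$
and $0 < \gamma_1, \gamma_3 < .1$ then $k$ is approximately $\dfrac{\log 500}{\log 100} = 1.35$.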
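The threshold $k$ is easy to evaluate. The sketch below is illustrative only; the function name `k_threshold` is an assumption made here, and the formula used is $k = \log(5/\gamma_2) / \log((1 - \bar\gamma)/\gamma_2)$ with $\bar\gamma$ the larger of $\gamma_1$ and $\gamma_3$. Letting $\bar\gamma$ tend to zero gives the approximate values quoted in the text.

```python
import math


def k_threshold(g1, g2, g3):
    """k = log(5/g2) / log((1 - g_bar)/g2), g_bar = max(g1, g3)."""
    g_bar = max(g1, g3)
    return math.log(5.0 / g2) / math.log((1.0 - g_bar) / g2)


# With gamma_1 and gamma_3 very small, k approaches log(5/g2) / log(1/g2).
print(round(k_threshold(1e-12, 0.05, 1e-12), 2))
print(round(k_threshold(1e-12, 0.01, 1e-12), 2))
# Equal gammas of .029.
print(round(k_threshold(0.029, 0.029, 0.029), 2))
```

Because $\log((1 - \bar\gamma)/\gamma_2)$ decreases as $\bar\gamma$ grows, $k$ increases with $\bar\gamma$; at $\bar\gamma = .1$ and $\gamma_2 = .05$ it is about 1.59.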

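We shall now investigate under what conditions the approximate solution
obtained above for $A$, $B$, $\hat A$, $\hat B$ is such that acceptance of both $\theta_1$ and $\theta_4$ is
impossible.

It follows from (2.1) that the following pair of inequalities are sufficient for
the impossibility of accepting both $\theta_1$ and $\theta_4$:

(5.12) $\dfrac{A}{\hat A} = \dfrac{\gamma_2}{\gamma_1} \cdot \dfrac{1 - \gamma_2}{1 - \gamma_3} \leq 1; \qquad \dfrac{B}{\hat B} = \dfrac{\gamma_2}{\gamma_3} \cdot \dfrac{1 - \gamma_2}{1 - \gamma_1} \leq 1$.

If $\gamma_1 \neq \gamma_3$ let the smaller and larger of the pair $(\gamma_1, \gamma_3)$ be denoted by $\underline{\gamma}$ and
$\bar\gamma$ respectively. Since $1 - \underline{\gamma} > 1 - \bar\gamma$, then

$\dfrac{\gamma_2 (1 - \gamma_2)}{\bar\gamma (1 - \underline{\gamma})} < \dfrac{\gamma_2 (1 - \gamma_2)}{\underline{\gamma} (1 - \bar\gamma)}$

and we need only consider one of the two inequalities in (5.12). The condition
$\gamma_2 < \underline{\gamma}$ will in general satisfy (5.12). More precisely if all the $\gamma$'s are restricted
to the interval $(0, .1)$ then

$\dfrac{9}{10} < \dfrac{1 - \gamma_2}{1 - \underline{\gamma}} < \dfrac{1 - \gamma_2}{1 - \bar\gamma} < \dfrac{10}{9}$

and it is sufficient for the validity of (5.12) that $\gamma_2 \leq (.9)\underline{\gamma}$.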





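If $\gamma_1 = \gamma_3 = \gamma$ (say) then the two inequalities reduce to one

$\gamma_2 - \gamma_2^2 - \gamma + \gamma^2 \leq 0$

which can be written as

$(\gamma_2 - \gamma)(\gamma_2 - 1 + \gamma) \geq 0$.

Since the inequality $\gamma_2 \geq 1 - \gamma$ is impossible when all $\gamma$'s are $< \tfrac{1}{2}$, we see that
$\gamma_2 \leq \gamma$ is sufficient for the validity of (5.12) when $\gamma_1 = \gamma_3 = \gamma < \tfrac{1}{2}$.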

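There remains the problem of finding an approximate solution for equations
(5.2) to (5.5) when $r < k$. Since $d = \theta_3 - \theta_1 \geq \theta_2 - \theta_1 = \Delta$, so that
$r = \dfrac{2d}{\Delta} - 1 \geq 1$, we merely have to consider the interval $1 \leq r < k$.
The following approximations are used

(5.13) $\dfrac{1 - B}{A - B} \approx \dfrac{1}{A}$, $\quad \dfrac{B(A - 1)}{A - B} \approx B$, $\quad \dfrac{1 - \hat B^r}{\hat A^r - \hat B^r} \approx \dfrac{1}{\hat A^r}$,

$\dfrac{1 - \hat B}{\hat A - \hat B} \approx \dfrac{1}{\hat A}$, $\quad \dfrac{\hat B(\hat A - 1)}{\hat A - \hat B} \approx \hat B$, $\quad \dfrac{B^r(A^r - 1)}{A^r - B^r} \approx B^r$,

which upon substitution yield

(5.14) $\dfrac{1}{A} = \gamma_1$

(5.15) $\hat B = \gamma_3$

(5.16) $B + \dfrac{1}{\hat A^r} = \gamma_2$

(5.17) $\dfrac{1}{\hat A} + B^r = \gamma_2$.

Subtraction of (5.17) from (5.16) shows that $B = \dfrac{1}{\hat A}$ is a solution. Substituting
this result back in (5.16) leads to the equation

(5.18) $B + B^r = \gamma_2$.

It can easily be verified that between zero and unity this equation has exactly
one root. Since $1 \leq r < \infty$, the root of the above equation lies between $\dfrac{\gamma_2}{2}$ and
$\gamma_2$.

Taking $\gamma_2$ as a first approximation for $B$ and substituting $\gamma_2 + \epsilon$ for $B$ in
(5.18), we obtain

$\epsilon + (\gamma_2 + \epsilon)^r = 0$.

Expanding $(\gamma_2 + \epsilon)^r$ in a power series in $\epsilon$ and neglecting second and higher
order terms, the above equation gives

$\epsilon = -\dfrac{\gamma_2^r}{1 + r\gamma_2^{r-1}}$.

Thus,

(5.19) $B = \dfrac{1}{\hat A} = \gamma_2 - \dfrac{\gamma_2^r}{1 + r\gamma_2^{r-1}} = \dfrac{\gamma_2 \left(1 + (r - 1)\gamma_2^{r-1}\right)}{1 + r\gamma_2^{r-1}}$.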
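The quality of the first-order solution can be checked against a numerical root of $B + B^r = \gamma_2$. The sketch below is illustrative; the function names `root_of_5_18` and `approx_5_19` are assumptions made here, and bisection on the interval $[\gamma_2/2, \gamma_2]$ is simply one convenient way to locate the root, not a method taken from the paper.

```python
def root_of_5_18(g2, r, iterations=200):
    """Root of B + B**r = g2 on [g2/2, g2], found by bisection (r >= 1)."""
    low, high = g2 / 2.0, g2          # the left side is <= g2 at low, >= g2 at high
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if mid + mid ** r > g2:
            high = mid
        else:
            low = mid
    return (low + high) / 2.0


def approx_5_19(g2, r):
    """First-order approximation g2 * (1 + (r-1) g2^(r-1)) / (1 + r g2^(r-1))."""
    t = g2 ** (r - 1.0)
    return g2 * (1.0 + (r - 1.0) * t) / (1.0 + r * t)


for r in (1.0, 1.2, 1.5):
    print(r, root_of_5_18(0.05, r), approx_5_19(0.05, r))
```

For $r = 1$ both functions give $\gamma_2 / 2$ exactly; for $r$ between 1 and $k$ the two values differ only slightly when $\gamma_2$ is small.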


It is necessary to investigate under what conditions the above approximate
solution satisfies (5.2) to (5.5) to within a 20% error in $[1 - L(\theta \mid R)]$, i.e., such that

(5.20) $\left| \dfrac{\gamma_1 (1 - B)}{1 - \gamma_1 B} - \gamma_1 \right| < \dfrac{\gamma_1}{5}$

(5.21) $\left| \dfrac{\gamma_3 (1 - B)}{1 - \gamma_3 B} - \gamma_3 \right| < \dfrac{\gamma_3}{5}$

(5.22) $\left| \dfrac{B(1 - \gamma_1)}{1 - \gamma_1 B} + \dfrac{B^r (1 - \gamma_3^r)}{1 - (\gamma_3 B)^r} - \gamma_2 \right| < \dfrac{\gamma_2}{5}$

(5.23) $\left| \dfrac{B(1 - \gamma_3)}{1 - \gamma_3 B} + \dfrac{B^r (1 - \gamma_1^r)}{1 - (\gamma_1 B)^r} - \gamma_2 \right| < \dfrac{\gamma_2}{5}$

where for $B$ the value in (5.19) is understood.

It can be shown that if $\gamma_1$, $\gamma_2$, $\gamma_3$ are each between zero and .1 then the
inequalities (5.20) to (5.23) hold. Furthermore it can be shown that if, in addition,
$\gamma_2 \leq \min(\gamma_1, \gamma_3)$ then also the inequalities (2.1) hold. The latter inequalities are
sufficient to ensure the impossibility of accepting both $\theta_1$ and $\theta_4$.


6. Bounds for the ASN Function. First we shall derive lower bounds for
the ASN function. Let $E(n/\theta, R)$ denote the expected value of $n$ when $\theta$ is the
true mean and $R$ is the sequential rule employed. For $\theta < \theta_2$ the probability of
coming to a decision first with $R_2$ is large and therefore

$E(n/\theta, R) \approx E(n/\theta, R_1)$ for $\theta < \theta_2$.

From the definition of $R$ it follows that

$E(n/\theta, R) \geq E(n/\theta, R_1)$ for all $\theta$.

Hence $E(n/\theta, R_1)$ serves as a close lower bound when $\theta < \theta_2$.

Similarly

$E(n/\theta, R) \approx E(n/\theta, R_2)$ for $\theta > \theta_3$

$E(n/\theta, R) \geq E(n/\theta, R_2)$ for all $\theta$.

Hence $E(n/\theta, R_2)$ serves as a close lower bound for $\theta > \theta_3$.

Combining the above we have

(6.1) $E(n/\theta, R) \geq \text{Max}\,[E(n/\theta, R_1), E(n/\theta, R_2)]$

where, neglecting the excess over the boundary,

(6.2) $E(n/\theta, R_1) = \dfrac{L(H_{\theta_1}/\theta, R_1) \log B + L(H_{\theta_2}/\theta, R_1) \log A}{\dfrac{\Delta}{\sigma^2}(\theta - a_1)}$

(6.3) $E(n/\theta, R_2) = \dfrac{L(H_{\theta_3}/\theta, R_2) \log \hat B + L(H_{\theta_4}/\theta, R_2) \log \hat A}{\dfrac{\Delta}{\sigma^2}(\theta - a_2)}$.

Formula (6.1) gives a valid lower bound over the whole range of $\theta$, but this
lower bound will not be very close in the interval $(\theta_2, \theta_3)$, particularly in the
neighbourhood of the mid-point $\dfrac{\theta_2 + \theta_3}{2}$. The authors were not able to find any
simple method for obtaining a closer lower bound in this interval. The upper
bound given later in this section will, however, be fairly close also in the interval
$(\theta_2, \theta_3)$ and can be used as an approximation to the exact value.

We shall now derive upper bounds for the ASN function. Let $R_1^*$ be the
following rule: "Continue to take observations until $R_1$ accepts $\theta_1$." Since this implies
the rejection of $\theta_4$ at the same or at a previous stage, it follows that $R$ must
terminate not later than $R_1^*$. Hence

(6.4) $E(n/\theta, R_1^*) \geq E(n/\theta, R)$.

As a matter of fact one can easily verify that $E(n/\theta, R_1^*) > E(n/\theta, R)$. Thus
$E(n/\theta, R_1^*)$ is an upper bound for $E(n/\theta, R)$. This upper bound will be close
when the probability of accepting $\theta_1$ is high, i.e., for $\theta \leq \theta_1$.

By the general formula

$E(n) = \dfrac{E(Z_n)}{E(z)}$

(see p. 63 [1]) we obtain, upon neglecting the excess over the boundary,

(6.5) $E(n/\theta, R_1^*) = \dfrac{\log B}{\dfrac{\Delta}{\sigma^2}(\theta - a_1)}$.

This coincides with (6.2) when $L(H_{\theta_2}/\theta, R_1) = 0$.

Similarly, if $R_2^*$ denotes the rule of continuing until $R_2$ accepts $\theta_4$, then

(6.6) $E(n/\theta, R_2^*) > E(n/\theta, R)$

(6.7) $E(n/\theta, R_2^*) = \dfrac{\log \hat A}{\dfrac{\Delta}{\sigma^2}(\theta - a_2)}$

and this will be a close upper bound for $\theta \geq \theta_4$.

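If $A = \hat A = \dfrac{1}{B} = \dfrac{1}{\hat B}$ and if $a_1 + a_2 = 0$ the above results reduce to

(6.8) $E(n/\theta, R) \lesssim E(n/\theta, R_1^*) = \dfrac{h}{-\theta - \lambda}$ for $\theta \leq \theta_1$

(6.9) $E(n/\theta, R) \lesssim E(n/\theta, R_2^*) = \dfrac{h}{\theta - \lambda}$ for $\theta \geq \theta_4$

where the symbol $\lesssim$ stands for a close inequality, and where

$h = \dfrac{\sigma^2}{\Delta} \log A$ and $\lambda = a_2 = -a_1$.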


To establish an upper bound for the ASN function in the interval $\theta_2 < \theta < \theta_3$
we shall restrict ourselves to the case where $A = \hat A = \dfrac{1}{B} = \dfrac{1}{\hat B}$. These relations are
fulfilled by the approximate values of $A$, $B$, $\hat A$, $\hat B$ suggested in section 5 when
$\gamma_1 = \gamma_2 = \gamma_3$ and $r \geq k$. We shall choose the origin to be at $\dfrac{a_1 + a_2}{2}$, i.e., we put
$a_1 + a_2 = 0$. Then the vertex $P$ of the triangle $(P_1, P_2, P)$ in diagram 1 lies on
the abscissa axis and $OP_1 = OP_2 = h$. The abscissa of the vertex $P$ is $\dfrac{h}{\lambda} = N$ (say)
where $\lambda = a_2 = -a_1$. Let $y = \sum_{\alpha=1}^{N} x_\alpha$ represent the sum of the first $N$
observations. Let $R_{23}$ denote the rule: "Continue until both $\theta_2$ and $\theta_3$ are accepted".
This is tantamount to neglecting the two outer lines in diagram 1, i.e., the
acceptance lines for $\theta_1$ and $\theta_4$. Then clearly,

(6.10) $E(n/\theta, R_{23}) \geq E(n/\theta, R)$.

When $\theta$ lies between $\theta_2$ and $\theta_3$ this inequality will be close, since the probability of
crossing either of the two outer lines is then small.
However $E(n/\theta, R_{23})$ was found difficult to compute and it was necessary to
consider instead the rule $R'_{23}$: "Take $N$ observations. If $y = \sum_{\alpha=1}^{N} x_\alpha < 0$ then
continue until $\theta_2$ is accepted. If $y > 0$ then continue until $\theta_3$ is accepted".³ Clearly,

(6.11) $E(n/\theta, R'_{23}) \geq E(n/\theta, R_{23})$.

This inequality, however, will be close only if the probability of concluding the
test before $N$ observations, given that $\theta_2 < \theta < \theta_3$, is small.

Some investigations by the authors seem to indicate that the inequality (6.11)
will be close when $\Delta < \lambda$. This inequality is likely to be fulfilled in practical
problems.

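We shall now proceed to determine the value of $E(n/\theta, R'_{23})$. Neglecting the
excess over the boundary, we have

(6.12) $E\left(n/\theta, R'_{23}, \sum_{\alpha=1}^{N} x_\alpha = y\right) = \dfrac{h}{\lambda} + \dfrac{y}{\lambda - \theta}$ for $y > 0$

and

(6.13) $E\left(n/\theta, R'_{23}, \sum_{\alpha=1}^{N} x_\alpha = y\right) = \dfrac{h}{\lambda} - \dfrac{y}{\lambda + \theta}$ for $y < 0$

where, for any condition $C$, $E(n/\theta, R, C)$ denotes the conditional expected value
of $n$ given that the true mean is $\theta$, that $R$ is the sequential rule used and that the
condition $C$ is fulfilled.

³ The event $y = 0$ has probability zero and it is indifferent what rule is adopted for that
case.

Multiplying with the density of $y$ and then integrating with respect to $y$, we
obtain after simplification

(6.14) $E(n/\theta, R'_{23}) = \dfrac{1}{\lambda^2 - \theta^2} \left[ h\lambda + 2h\theta \, \varphi\!\left(\dfrac{\theta \sqrt{N}}{\sigma}\right) + 2\sigma\lambda \sqrt{\dfrac{N}{2\pi}} \, e^{-N\theta^2/2\sigma^2} \right]$

where $\varphi(x) = \displaystyle\int_0^x \dfrac{e^{-y^2/2}}{\sqrt{2\pi}} \, dy$, and $\theta_2 < \theta < \theta_3$.

In particular, for $\theta = 0$ we get

(6.15) $E(n/\theta = 0, R'_{23}) = \dfrac{h}{\lambda} + \dfrac{2\sigma}{\lambda} \sqrt{\dfrac{N}{2\pi}}$.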


To establish a close upper bound for $\theta_3 < \theta < \theta_4$ we must bring the line of
acceptance of $\theta_4$ into account. The line of acceptance of $\theta_1$ can be disregarded
since the probability of accepting $\theta_1$ is very small.

We therefore define the rule $R_{34}$ as follows:

"Continue with $R_1$ until $\theta_2$ is accepted and with $R_2$ until either $\theta_3$ or $\theta_4$ is
accepted."

Since the ASN function for $R_{34}$ is difficult to compute we define a modified
rule $R'_{34}$ as follows:

"Proceed to take $N = \dfrac{h}{\lambda}$ observations without regard to any rule. If $y =
\sum_{\alpha=1}^{N} x_\alpha < 0$ then continue only with $R_1$ until $\theta_2$ is accepted. If $0 < y < 2h$ then
continue only with $R_2$ until either $\theta_3$ or $\theta_4$ is accepted. If $y \geq 2h$ then stop taking
observations and accept $H_3$."

It is clear that the following inequalities hold

(6.16) $E(n/\theta, R'_{34}) \geq E(n/\theta, R_{34}) \geq E(n/\theta, R)$.

The proximity of $E(n/\theta, R_{34})$ and $E(n/\theta, R)$, as stated above, is based on the
fact that the probability of accepting $\theta_1$, when $\theta_3 < \theta < \theta_4$, is small.

The proximity of $E(n/\theta, R'_{34})$ and $E(n/\theta, R_{34})$ is assured if the probability
of terminating with $R_{34}$ (and with $R$) before $N$ observations is small. It can be
shown that the latter condition is fulfilled when $\Delta < \lambda$. In terms of the
quantity $r$ defined in Section 5 this can be written as $r > 3$.

To determine the value of $E(n/\theta, R'_{34})$ the following two preliminary results
will be needed:

If $0 < y < 2h$,

(6.17) $E\left(n/\theta, R'_{34}, \sum_{\alpha=1}^{N} x_\alpha = y\right) = \dfrac{h}{\lambda} + \dfrac{2h - y}{\theta - \lambda} - \dfrac{2h}{\theta - \lambda} \left[ \dfrac{1 - e^{-(2/\sigma^2)(\lambda - \theta)(2h - y)}}{1 - e^{-(4h/\sigma^2)(\lambda - \theta)}} \right] = C$ (say).

If $y < 0$,

(6.18) $E\left(n/\theta, R'_{34}, \sum_{\alpha=1}^{N} x_\alpha = y\right) = \dfrac{h}{\lambda} - \dfrac{y}{\lambda + \theta} = D$ (say).

Both are easily obtained from formula (7.25) on p. 123 of [1].

Multiplying with the density of $y$ and integrating with respect to $y$, we obtain
after simplification


E{n/6, R',/) = J + 




fe 2 e' 




(6.19) 


(\-0)IX ^ ^ 


+ 


X(X - 6) y 2ir 

he 

2X(X + d) 


h\ _ g-X(2X-8)2/2X»»j 




-CX6*/2Xit*) 


Formula (6.19) is an improvement on (6.14) as it will give for any $\theta$ a smaller
upper bound, but in the neighborhood of the origin the difference is insignificant.
For $\theta = \lambda$ we obtain from (6.19) using L'Hôpital's rule


( 6 . 20 ) 


Ein/\, R',,) = ^ (m - 3v“) 

<7^ 4X0-^ 




-Wirt) 


If 


Vh\ 


> 2.5, the above formula can be approximated by 


( 6 . 21 ) 


h , 2 <r , /h\ 


E{n/\ + 


n 

Since the right hand member above lies between - and (1.002) -= when 
/h\ 

—— > 2.5 then for practical purposes 


( 6 . 22 ) 


Ein/\ RU) ~ \ 

<T* 


when > 2.5 


5 ). 



518 


MILTON BOBBL AND ABRAHAM WALD 


An upper bound for E{n/B, R) for di < 6 < 8i can be obtained by defining 
$R_{12}$ and $R'_{12}$ in an analogous way to $R_{11}$ and $R'_{11}$. Because of reasons of symmetry,
$E(n/\theta, R'_{12})$ can be obtained from (6.19) by replacing $\theta$ by $-\theta$.

The method used for obtaining upper bounds for E{n/6, R) can easily be 

1 1 

extended to the more general case when the equalities = ^=^do not 

necessarily hold. However, the resulting formulas are more cumbersome and we 
shall merely give without proof the upper bound corresponding to (6.14). This 
upper bound becomes 

B(n/S, b;.) - N + (f^")[i - ♦(«)] + - *(»] 


Y ^TT -"0 X 0_ 


where 


hi = - log A 
hi = ^ log A 


ho == - log 5 
hio = ^ log .S 


hi — hi 


Oj = -tti 

h - Ne 
cVN ’ ^ 


h + Ne 


7. An Example. We shall consider the following example 


hi + ho 

2 


<r — I, $i = $2 = — Aj == Aj ^4 = A, yi = Yj =* 78 = 7 = .029 

then 


A = i = ^ = i = - -2 = 33,5 

Jj IS y 


r = 7»3>k^ 1,47 


^ log A = 28, X = 


i, A = 02 ~ 9i = di — 


Using formulas (6.1) and (6.7) the following upper and lower bounds were
obtained:

$\theta$                  8/16   9/16   10/16  12/16  14/16  16/16  18/16  20/16
Upper and lower bound     112    89.6   74.7   56     44.8   37.3   32     28






























Formulas (6.14) and (6.1) yield

$\theta$         0      1/16   2/16   3/16
Upper Bound      146    163    229    450
Lower Bound      112    149    224    421


In the neighborhood of the origin the true value is very nearly the upper bound.
From formulas (6.19), (6.22) and (6.1) we obtain

$\theta$         3/16   4/16   5/16
Upper Bound      422    784.5  423
Lower Bound      421    784    421


As shown above for the end points of the indifference zone, (6.19) gives better 
results than (6.14) or (6.7). This is as it should be since (6.19) takes into account 
possibilities omitted in (6.14) and (6.7). The greater accuracy of (6.19) is offset 
by a slight increase in computation. 

In the graph of the Bounds of the ASN function shown in Figure 2, a single 
curve is shown wherever the upper and lower bound are sufficiently close to 
each other. 

Since (6.14) contains an even function of $\theta$ and since elsewhere the corresponding
bounds are mirror images with respect to $\theta = 0$, the bounds for negative $\theta$
are exactly the same as those for the corresponding positive $\theta$.

Consider the following non-sequential rule applied to our problem. With a
fixed number $N_0$ of observations compute the mean $\bar{x}$ and accept $H_1$ if $\bar{x}$ falls in
the interval $(-\infty, a_1)$, accept $H_2$ if $\bar{x}$ falls in $[a_1, a_2]$ and accept $H_3$ if $\bar{x}$ falls in
$(a_2, \infty)$. This is certainly a reasonable procedure. One can also verify that no
other non-sequential rule exists that is uniformly better (for all possible values of
$\theta$) than the one under consideration.

The two decision procedures become comparable if we introduce the indifference
zones and define a wrong decision in the non-sequential case exactly as was
done for our sequential procedure (see Section 1).

For the non-sequential case (just as in the sequential case) the probability of a
wrong decision will be discontinuous at $\theta_1$, $\theta_2$, $\theta_3$ and $\theta_4$. At each of these points
there will be a left-sided and right-sided limit, different from each other. As in the
sequential case we shall take the probability of a wrong decision at a discontinuity
point to be equal to the larger of the left and right hand limits. One can easily
verify that the maximum probability of a wrong decision occurs at $\theta = \theta_2$ (which
is equal to the value at $\theta = \theta_3$).






We then determine $N_0$ by setting the maximum probability of a wrong decision
equal to $\gamma$, i.e.

(7.1) 0 (tuM vTc) + VFo) « 1 - 7. 



For the particular problem considered above, this gives $N_0 = 916.4$. Hence
916 observations are required in order to ensure that this non-sequential
procedure will have the maximum probability $\gamma = .029$ of a wrong decision. This
is to be compared with the maximum over all $\theta$ of the ASN function in the
sequential procedure, which was 784.6.

Returning to (7.1) we shall derive lower and upper bounds for the root of that 
equation. Since 

„ > vFo ^ I VF, 

a Zu 











it is clear that the root of the equation 

= 1 - T 

is an upper bound for the root of (7.1) and that the root of the equation 
</>(“) + = 1 — 7 


or 




is a lower bound for the root of (7.1). We shall compare the value of a: = —\/Nq 

2(7 

with the value oiy = — \/Max ASN. Since 

Z<T 6 

^ 22 /^ _ \1 . 

Max (ASN function) rv/ _ = ^log —j (for sufficiently small ^ ), 


then 


A —- 1 <y ^ 

= r- V Max ASN log -1 (for sufficiently small -?). 

2(7 g ^ y d 


The following table gives upper and lower bounds for x and the corresponding 

value of y for the type of example under consideration, i.e., when A = A = -= -f, 

a ts 

and r '^k. 


7 

001 

.002 

005 

.008 

01 

.05 

1 

X and S 

3.P8-3.31 


2.67-2.81 

2.41-2.65 

2.33-2.68 

1.64-1.96 

1 28-1.66 

y 

3.46 

3.11 

2 85 

2.41 

2.30 

1.47 

1.10 


As the table shows^ for .1 > 7 > .008 

X > $ > y 


* Actually, the inequality in question is shown only for the values of $\gamma$ given in the
table. However it can be verified that the inequality remains valid for all values of $\gamma$
between .1 and .008.




















and hence 

A^q > Max ASN (for sufficiently small 

I a 

The statement and the table above are not meant to delimit the region in which 
the sequential rule is superior to the non-sequential procedure. 

REFERENCES

[1] Abraham Wald, Sequential Analysis, John Wiley and Sons, 1947.

[2] P. Armitage, "Some sequential tests of Student's hypothesis," Supplement to the Journal
of the Royal Statistical Society, Vol. 9 (1947) No. 2, p. 250.



MOMENTS OF RANDOM GROUP SIZE DISTRIBUTIONS* 

By John W. Tukey
Princeton University 

1. Summary. A number of practical problems involve the solution of a mathe¬ 
matical problem of the class described in the classical language of probability 
theory as follows. “A number of balls are independently distributed among a 
number of boxes, how many boxes contain no balls, 1 ball, 2 balls, 3 balls, and 
so on.” Problems arising in the oxidation of rubber and the genetics of bacteria 
are discussed as applications. 

A method is given of solving problems of this sort when “how many” is 
adequately answered by the calculation of means, variances, covariances, third 
moments, etc. The method is applied to a number of the simplest cases, where 
the number of balls is fixed, binomially distributed or Poisson and where the 
“sizes” of the boxes are equal or unequal. 

2. Introduction. The distribution of the number of empty boxes has been 
investigated by Romanovsky in 1934 [3], and, apparently independently, by 
Stevens in 1937 [4]. Romanovsky investigated the case of N equal boxes and 
m balls for (i) the case where the balls are independent, and (ii) the case where 
there is a limit to the size of each box. He gives no motivation for the problem, 
and shows that certain limiting distributions approach normality. Stevens 
investigated the case of m independent balls for N boxes (i) of equal size, and 
(ii) of unequal size, and developed a useful approximation for the last case.
Stevens was concerned with this problem in order to test box counts for non¬ 
randomness by comparing the number of empty boxes with expectation. The 
reader interested in that problem is referred to his paper. 

The results derived in Part II are based on the use of a chance generating
function, a technique which applies easily to the case where the balls are
independent. Thus Romanovsky's results for the case of boxes of limited size are
neither included nor extended. For the other cases where the number of empty
boxes has been considered, the results below seem to provide simple moments 
and cross-moments for the numbers of boxes with any number of balls to the 
extent previously available for the number of empty boxes. Both Romanovsky 
and Stevens investigated the actual distribution of the number of empty boxes. A 
similar investigation of the distribution of the number of b-ball boxes has not 
been carried out here. 

3. A chemical problem. In studying the oxidation of rubber, Tobolsky and
coworkers were led to propose the following problem: “If a mass of rubber
originally consisted of N chains of equal length, if each chain can be broken at a

* Prepared in connection with research sponsored by the Office of Naval Research.



large number of places by the reaction with one oxygen molecule, if there are m
oxygen molecules each equally likely to react at each link, and if mNp molecules
have reacted, what is the probable number of original chains which are now in
b + 1 parts as a result of b oxygen molecules having reacted with b of their links?”

Here an original chain plays the role of a box and an oxygen molecule the
role of a ball. The sort of numbers which may be taken as characteristic are:

N ==■ 10 “ (number of chains), 

m ® 10 “ to 10 “ (number of oxygen molecules), 
mp = 0.01 to 100 (average breaks/chain). 

Thus it is almost certainly going to be appropriate to use the results obtained by
assuming N and m very large and p = 1/N very small. We shall return to this
example after discussing the general results.


4. A bacteriological problem. The experiments of Newcombe [1] on the
irradiation and mutation of bacteria have prompted Pittendrigh to propose the
following problem: “Suppose a large number of bacteria each contain m enzyme
particles, which have been formed by the action of a nuclear gene. Suppose
that irradiation destroys the nuclear gene in a certain fraction of the bacteria.
Suppose three generations to occur, during which the m original enzyme particles
are randomly distributed among the 8 descendants of an original bacterium.
If a bacterium without either nuclear gene or enzyme particle is a recognizable
mutant, what is the expected distribution of “families” with 0, 1, 2, 3, ..., 8
mutants?”

Here the enzyme particles are the balls, and the 8 descendants are the N
boxes. We are interested in the number of empty boxes; the problem is that
discussed by both Romanovsky and Stevens, with the exception of an allowance
for cases where the nuclear gene was not lost. We shall return to this problem
also after discussing the general results.


5. The case of large numbers. In case the number of “balls” and “boxes” is
large, it is natural and has been customary in similar problems to replace discrete
variables by continuous, and derive differential equations. The process runs as
follows: Let $y_0, y_1, y_2, \ldots, y_b, \ldots$ be the fractions of the total number of boxes
containing no, one, two, ..., b, ... balls. Let $t$ be the average number of balls
per box (artificially made continuous, so that we may, for example, have a total
of $13 + 3\pi$ balls). Increase $t$ to $t + dt$, then of the $y_0$ boxes previously containing
no balls, $y_0\,dt$ will receive one. Of the $y_1$ boxes previously containing one ball each,
$y_1\,dt$ will receive a second, and so on. Hence

    $\dfrac{dy_0}{dt} = -y_0$,

    $\dfrac{dy_1}{dt} = y_0 - y_1$,

    ...

    $\dfrac{dy_b}{dt} = y_{b-1} - y_b$,

and if we start, when $t = 0$, with $y_0 = 1$, and $y_b = 0$ for $b > 0$, we find

(1)    $y_b = \dfrac{t^b e^{-t}}{b!}$,    $b = 0, 1, 2, \ldots$.
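The differential equations of this section can be checked numerically. The following is only a sketch: the function names, the truncation at ten balls per box, and the midpoint integration rule are choices made here for illustration and are not part of the paper. It integrates the system from $t = 0$ with all boxes empty and compares the result with the closed form $t^b e^{-t}/b!$.

```python
import math

def poisson_fraction(b, t):
    """Closed-form solution y_b(t) = t^b e^{-t} / b!."""
    return t**b * math.exp(-t) / math.factorial(b)

def integrate_box_fractions(t_end, max_b=10, steps=20000):
    """Integrate dy_0/dt = -y_0, dy_b/dt = y_{b-1} - y_b by the midpoint rule.

    The system is lower triangular, so truncating at max_b does not
    disturb the fractions y_0, ..., y_max_b.
    """
    def deriv(y):
        return [-y[0]] + [y[b - 1] - y[b] for b in range(1, len(y))]

    y = [1.0] + [0.0] * max_b          # every box is empty at t = 0
    h = t_end / steps
    for _ in range(steps):
        k1 = deriv(y)
        midpoint = [yi + 0.5 * h * ki for yi, ki in zip(y, k1)]
        k2 = deriv(midpoint)
        y = [yi + h * ki for yi, ki in zip(y, k2)]
    return y

y = integrate_box_fractions(2.0)
for b in range(4):
    print(b, round(y[b], 6), round(poisson_fraction(b, 2.0), 6))
```

The two printed columns should agree to the displayed precision, which is the content of the solution quoted above.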


The usefulness of this result has sometimes been in doubt; thus Opatowski
[2, p. 164] says in a similar connection: “Consequently ... the theory appears
less accurate for small values of t.”

It is shown in Part II that, where $n_b$ boxes out of the total of N contain
exactly b balls: (I) When the number of balls and boxes is large and fixed, (1)
is a good approximation to the expectation of $n_b/N$. (II) When the total number
of balls has a Poisson distribution, and t is interpreted as the expected number,
(1) reproduces the expectation exactly. Since it is appropriate in most problems
involving chemical reactions or irradiation to take the number of balls as having a


TABLE 1 


A fixed or binomial number of balls and equal boxes 








Poisson distribution, the caution suggested by (I) is often shown unnecessary
by (II). For this type of problem the differential equation is entirely adequate!

It is further shown in Part II that, in the Poisson case, the second moments
are exactly those which correspond to random sampling from an infinite
population with the fractions indicated by the mean number of boxes with 0, 1, 2, ...,
b, ... balls. This result is not accidental, and it is shown in Part III how we can
see directly that the whole distribution in this case is that of random sampling
from such a population.

6. The case of small numbers. The results of Part II also allow us to state
the means, variances, and covariances, for the cases where the differential
equations do not apply. The results are set forth in the following tables: Tables 1
and 2 apply to the cases where m balls are distributed among the given boxes
and possibly others. Thus the total number of balls in the given boxes is either
fixed, when there are no other boxes, or follows a binomial distribution.


TABLE 2

A fixed or binomial number of balls and unequal boxes


HYPOTHESIS

A total of $m$ balls are independently distributed into $N$ boxes or elsewhere, the
chance of a particular ball entering the $i$th box being $p_i$. The average of the
$p_i$ is $p$. The sum of the squared fractional deviations of $p_i$ from $p$ is $A$:
$p_i = p(1 + \lambda_i)$, $\sum_i \lambda_i^2 = A$. Terms in $\sum_i \lambda_i^3$, $\sum_i \lambda_i^4$, etc. are to be neglected. The
number of boxes each containing exactly $b$ balls is $n_b$.


Mean of m, = EM = N (1 - times 

{(l + ^ 2 ) - (m - b)p'^ - b(l - p^) I 

Variances and covariances as in Table 1, using 


where ^ = 2bc 


- s) 


+ terms in and in 


The exact value of $\psi$ is given in Section 16.







TABLE 3 

Poisson balls and equal boxes 



7. Discussion of the chemical problem. The number of oxygen molecules
which have reacted in a given time is, at best, distributed Poisson. Thus the
differential equations would give the expected number of cuts, even if the
number of balls or boxes were not large.

The fact that the numbers of balls and boxes are large makes the variances
and covariances so small as to be practically unimportant. Thus, for example,
with N = lO'®, f = 1 (1 break per chain), we have: 

mean of tio = - X lO'®, 
e 


mean of Wi = - X 10^®, 
e 


variance 


variance of ni 


of 710 = X 10”, 

of m = i X 10“, 


covariance of no and Wi 


X 10” 


Thus the standard deviations are less than 1 part in 100 million of the mean. 







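TABLE 4

Poisson balls and varied boxes


HYPOTHESIS

A number of balls with the Poisson distribution are independently placed in $N$
unequal boxes. The expected number placed in the $i$th box is $t_i$. The average of
the $t_i$ is $t$, $t_i = t(1 + \lambda_i)$ and $\sum_i \lambda_i^2 = A$. Terms in $\sum_i \lambda_i^3$, $\sum_i \lambda_i^4$, etc. are to be
neglected. The number of boxes each containing exactly $b$ balls is $n_b$.


Mean of $n_b$ = $E(n_b) = N \dfrac{t^b e^{-t}}{b!}\left(1 + \dfrac{A}{2N}\left((b-t)^2 - b\right)\right)$

Variance of $n_b$ = $E(n_b) - \dfrac{1}{N}\left(1 + \dfrac{A}{N}(b-t)^2\right)(E(n_b))^2$

Covariance of $n_b$ and $n_c$ = $-\dfrac{1}{N}\left(1 + \dfrac{A}{N}(b-t)(c-t)\right)E(n_b)E(n_c)$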


Mean of no = Ne~‘ (1 + 


Mean of ni = Vte~‘ (1 + 


2N/ 

A(f® - 20' 
2N , 


Variance of rto = Ve * ^1 + 

Variance of Ui = Nle~‘ ^1 + - - 2 jy ~ ^ 

Covariance of no and ?h = ~Nl^e~^‘ ^1 + 


Covariance of no and m = ~Nt e (1 + 


8. Discussion of the bacteriological example. Although this example came
from an irradiation experiment, we are not entitled to jump to the Poisson 
case. The balls are not actions of radiation, but rather previously existing 
enzyme particles. The purpose of the radiation is merely to make a failure to 
hand down a particle obvious. 

For simplicity, let us begin by assuming that the irradiation has been strong 
enough to knock out all the nuclear genes and none of the enzyme particles. We 
face the following problem: “If the m enzyme particles are divided by chance 
among 8 descendants, what should be the distribution of mutants, that is, of 
boxes with no balls?” 

As far as mean and variance, we can answer this question from Table 1,
with $N = 8$ and $p = \tfrac{1}{8}$.






The results are

    mean number of mutants = $E(n_0) = 8\left(\tfrac{7}{8}\right)^m$,
    variance of same = $8\left(\tfrac{7}{8}\right)^m - 64\left(\tfrac{7}{8}\right)^{2m} + 56\left(\tfrac{6}{8}\right)^m$.

For small values of m we get the values tabled below:

TABLE 5

Blanks out of 8

  m     mean     variance   mean(1 - mean/8)
  0     8        0.000      0.000
  1     7        0.000      0.875
  2     6.125    .109       1.436
  3     5.359    .262       1.769
  4     4.689    .417       1.941
  5     4.103    .556       1.998
  6     3.590    .666       1.979
  7     3.142    .747       1.908
  8     2.749    .799       1.804
  9     2.405    .825       1.682
 10     2.105    .829       1.551
 15     1.079    .663       .934
 20     0.554    .426       .515


We notice that the variance is substantially less than the mean. 

Now it might be that the number of enzyme particles is not constant from 
bacterium to bacterium. It would not be unreasonable if it had a Poisson dis¬ 
tribution. If this were the case, we would revert to the differential equation 
solution, which is also given in Table 3. The last column in Table 5 shows the 
variance which would then arise for the same means. The variance is still some¬ 
what less than the mean. The situation is shown graphically in Figure 1. 
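The entries of Table 5 follow directly from the two formulas for the mean and variance of the number of empty boxes, together with the expression mean(1 - mean/8) used for the last column. The short sketch below recomputes them; the function name and its `boxes` parameter are introduced here for illustration only.

```python
def blank_moments(m, boxes=8):
    """Mean and variance of the number of empty boxes for m balls in equal boxes.

    Returns (mean, variance for a fixed number of balls,
             mean * (1 - mean / boxes), the last column of Table 5).
    """
    miss_one = (boxes - 1) / boxes     # one given box misses a given ball
    miss_two = (boxes - 2) / boxes     # two given boxes both miss a given ball
    mean = boxes * miss_one**m
    variance = mean - boxes**2 * miss_one**(2 * m) + boxes * (boxes - 1) * miss_two**m
    last_column = mean * (1 - mean / boxes)
    return mean, variance, last_column

for m in (0, 1, 2, 3, 4, 5, 10, 20):
    mean, variance, last_column = blank_moments(m)
    print(f"{m:3d}  {mean:6.3f}  {variance:6.3f}  {last_column:6.3f}")
```

With `boxes=8` the constants 64 and 56 of the variance formula appear as `boxes**2` and `boxes * (boxes - 1)`.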

If the actual distribution of $n_0$ is desired, then it can be calculated for the
case where m is fixed from the tables in Stevens' paper [4], and when m is
distributed Poisson it is merely a binomial distribution.

PART II 

DERIVATIONS 

9. The chance generating function. We are considering the following class of
problems: “balls” are placed independently in “boxes” and then the number $n_0$
of empty compartments, the number $n_1$ of compartments containing exactly
one ball, ..., the number $n_b$ of boxes with exactly $b$ balls, and so on, are observed.
We are interested in the moments of $n_0, n_1, n_2, \ldots, n_b, \ldots$ both simple
and mixed.


Figure 1. Ratio of variance to mean for number of empty boxes out of eight.

We define chance quantities $x_{iq}$ by

    $x_{iq} = x$ if the $q$th ball is in the $i$th box, and $x_{iq} = 1$ otherwise.

Clearly the product of all $x_{iq}$ for fixed $i$ is given by

    $\prod_q x_{iq} = x^{(\text{number of balls in the } i\text{th box})}$.

Thus $\prod_q x_{iq} = x^b$ if and only if there are exactly $b$ balls in the $i$th box. Hence
the coefficient of $x^b$ in $\sum_i \prod_q x_{iq}$, the sum of $\prod_q x_{iq}$ over all boxes $i$, is $n_b$, the
number of boxes containing exactly $b$ balls.




We have the relation

    $\sum_i \prod_q x_{iq} = f(x) = \sum_b n_b x^b$,

where $f(x)$ is a chance function, and the $n_b$ and the $x_{iq}$ are chance quantities.

Now we take expectations of both sides, and use the fact that the expectation
of a sum is the sum of the expectations to obtain

    $\sum_i E\left(\prod_q x_{iq}\right) = E(f(x)) = \sum_b x^b E(n_b)$.

Now $x_{iq}$ and $x_{ir}$, for $q \ne r$, are independent since they are determined by
different and independent balls. Hence $E\left(\prod_q x_{iq}\right) = \prod_q E(x_{iq})$ and we have the
basic formula

(1)    $E(f(x)) = \sum_b x^b E(n_b) = \sum_i \prod_q E(x_{iq})$.
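To make the chance function $f(x)$ concrete, the sketch below builds its coefficients for a single placement of balls. The list `example`, which records the box entered by each ball, is a hypothetical placement chosen for illustration, and the function name is likewise not from the paper.

```python
from collections import Counter

def box_count_polynomial(assignment, n_boxes):
    """Return {b: n_b}, the coefficients of f(x) = sum over boxes of x^(balls in box).

    assignment[q] is the index of the box entered by ball q.
    """
    balls_in_box = Counter(assignment)                 # box index -> number of balls
    # Counter returns 0 for boxes that received no ball, so empty boxes count too.
    coefficients = Counter(balls_in_box[i] for i in range(n_boxes))
    return dict(coefficients)

# Hypothetical placement of six balls into five boxes.
example = [0, 0, 2, 3, 3, 3]
print(box_count_polynomial(example, n_boxes=5))
```

Here boxes 1 and 4 are empty, box 2 holds one ball, box 0 holds two and box 3 holds three, so $f(x) = 2 + x + x^2 + x^3$.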

10. Higher moments. By extending this device, we can obtain generating
functions for higher moments. Instead of the $x_{iq}$, we introduce a whole sequence
of chance quantities $x_{iq}, y_{iq}, \ldots, w_{iq}$, defined by

    $(x_{iq}, y_{iq}, \ldots, w_{iq}) = (x, y, \ldots, w)$ if the $q$th ball is in the $i$th box,
    $(x_{iq}, y_{iq}, \ldots, w_{iq}) = (1, 1, \ldots, 1)$ otherwise.

We find immediately that

    $f(x)f(y) \cdots f(w) = \left(\sum_i \prod_q x_{iq}\right)\left(\sum_j \prod_r y_{jr}\right) \cdots \left(\sum_n \prod_s w_{ns}\right)$

    $= \sum_i \sum_j \cdots \sum_n \prod_q x_{iq} y_{jq} \cdots w_{nq}$.

Taking expectations on both sides

    $E(f(x)f(y) \cdots f(w)) = \sum_i \sum_j \cdots \sum_n E\left(\prod_q x_{iq} y_{jq} \cdots w_{nq}\right)$

    $= \sum_i \sum_j \cdots \sum_n \prod_q E(x_{iq} y_{jq} \cdots w_{nq})$,

where we have used the fact that $x_{iq} y_{jq} \cdots w_{nq}$ and $x_{ir} y_{jr} \cdots w_{nr}$ are independent
when $q \ne r$ since they are determined by different and independent balls.
On the other hand,

    $f(x)f(y) \cdots f(w) = \left(\sum_b n_b x^b\right)\left(\sum_c n_c y^c\right) \cdots \left(\sum_a n_a w^a\right)$

    $= \sum_b \sum_c \cdots \sum_a (n_b n_c \cdots n_a)(x^b y^c \cdots w^a)$

so that

    $E(f(x)f(y) \cdots f(w)) = \sum_b \sum_c \cdots \sum_a (x^b y^c \cdots w^a) E(n_b n_c \cdots n_a)$.

Equating the two expressions for the expectation of $f(x)f(y) \cdots f(w)$, we have,
finally, the generating function for $E(n_b n_c \cdots n_a)$ in the form

(2)    $\sum_{b, c, \ldots, a} (x^b y^c \cdots w^a) E(n_b n_c \cdots n_a) = \sum_{i, j, \ldots, n} \prod_q E(x_{iq} y_{jq} \cdots w_{nq})$.

Thus a knowledge of $E(x_{iq} y_{jq} \cdots w_{nq})$ will allow us to determine the moments
of the $n$'s.




11. A fixed or binomial number of balls and equal boxes. Let there be $N$
boxes, and $m$ balls, each with probability $p$ of entering each box. If $pN = 1$ we
have the case where $m$ balls always appear in the boxes taken together: the
case of a fixed number of balls. If $pN < 1$, the number of balls appearing in all
boxes taken together is a binomial with expectation $mpN$.

Now $x_{iq}$ equals 1 with probability $1 - p$ and equals $x$ with probability $p$,
hence (1) becomes

    $\sum_b x^b E(n_b) = \sum_i \prod_q (1 - p + px) = N(1 - p + px)^m$.

Using the binomial theorem, the coefficient of $x^b$ is

(3)    $E(n_b) = N \binom{m}{b} (1 - p)^{m-b} p^b = N \binom{m}{b} (1 - p)^m \left(\dfrac{p}{1 - p}\right)^b$.

Now if $p$ is small, we may approximate $1 - p$ by $e^{-p}$ and by 1, respectively,
in its two occurrences, whence

    $E(n_b) \approx N \binom{m}{b} p^b e^{-mp}$,

and if $m$ is large compared to $b$ this becomes

    $E(n_b) \approx N \dfrac{(mp)^b}{b!} e^{-mp}$.
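Formula (3) can be checked by brute force in the fixed case $pN = 1$, where every one of the $N^m$ placements of the balls is equally likely. The sketch below does this with exact rational arithmetic; the function names and the small values of $N$ and $m$ are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from itertools import product
from math import comb

def expected_counts_by_enumeration(n_boxes, m):
    """Average n_b over all n_boxes**m equally likely placements (the case pN = 1)."""
    totals = [0] * (m + 1)
    for placement in product(range(n_boxes), repeat=m):
        for box in range(n_boxes):
            totals[placement.count(box)] += 1      # this box holds that many balls
    return [Fraction(total, n_boxes**m) for total in totals]

def expected_counts_by_formula(n_boxes, m, p):
    """E(n_b) = N * C(m, b) * (1 - p)^(m - b) * p^b for b = 0, ..., m."""
    return [n_boxes * comb(m, b) * (1 - p)**(m - b) * p**b for b in range(m + 1)]

n_boxes, m = 3, 4
p = Fraction(1, n_boxes)
print(expected_counts_by_enumeration(n_boxes, m))
print(expected_counts_by_formula(n_boxes, m, p))
```

The two printed lists coincide, and their entries sum to $N$, since every box contains some number of balls.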


12. Second moments. We must study $E(x_{iq} y_{jq})$. If $i = j$ then this is
$(1 - p + pxy)$ since the $q$th ball falls into both the $i$th and $j$th boxes with
probability $p$, otherwise into neither. If $i \ne j$, we immediately find the expectation
to be $(1 - 2p + px + py)$.

Hence, since $i = j$ in $N$ cases, and $i \ne j$ in $N(N - 1)$ cases,

    $E(f(x)f(y)) = N(1 - p + pxy)^m + N(N - 1)(1 - 2p + px + py)^m$;

by (2) this equals $\sum_{b,c} x^b y^c E(n_b n_c)$, and using the multinomial expansion we find

    $E(n_b n_c) = N(N - 1) \binom{m}{b\;c} (1 - 2p)^{m-b-c} p^{b+c} + \delta(b, c) N \binom{m}{b} (1 - p)^{m-b} p^b$,

where $\delta(b, c) = 1$ when $b = c$ and is zero otherwise, and where the multinomial
coefficient is given by

    $\binom{m}{b\;c} = \dfrac{m!}{b!\,c!\,(m - b - c)!}$.

We now set

(4)    $E(n_b n_c) = E(n_b)E(n_c)\Phi(b, c) + \delta(b, c)E(n_b)$
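The second-moment formula of this section admits the same brute-force check as formula (3). The sketch below again takes the fixed case $pN = 1$, enumerates all placements, and compares $E(n_b n_c)$ with the multinomial expression. The function names are illustrative, and the convention that the cross term is zero when $b + c > m$ is spelled out explicitly in the code.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial

def second_moment_by_enumeration(n_boxes, m, b, c):
    """E(n_b * n_c) averaged over all n_boxes**m equally likely placements."""
    total = 0
    for placement in product(range(n_boxes), repeat=m):
        occupancy = [placement.count(box) for box in range(n_boxes)]
        total += occupancy.count(b) * occupancy.count(c)
    return Fraction(total, n_boxes**m)

def second_moment_by_formula(n_boxes, m, p, b, c):
    """N(N-1) * multinomial * (1-2p)^(m-b-c) * p^(b+c) + delta(b,c) * E(n_b)."""
    cross = Fraction(0)
    if b + c <= m:      # two distinct boxes cannot hold more than m balls together
        multinomial = Fraction(factorial(m),
                               factorial(b) * factorial(c) * factorial(m - b - c))
        cross = n_boxes * (n_boxes - 1) * multinomial * (1 - 2 * p)**(m - b - c) * p**(b + c)
    diagonal = n_boxes * comb(m, b) * (1 - p)**(m - b) * p**b if b == c else 0
    return cross + diagonal

n_boxes, m = 3, 4
p = Fraction(1, n_boxes)
print(second_moment_by_enumeration(n_boxes, m, 1, 1),
      second_moment_by_formula(n_boxes, m, p, 1, 1))
```

Both values are computed as exact fractions, so agreement here is an identity rather than a numerical coincidence.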





where

    $\Phi(b, c) = \dfrac{N(N-1)\binom{m}{b\;c}(1-2p)^{m-b-c}p^{b+c}}{N\binom{m}{b}(1-p)^{m-b}p^{b} \cdot N\binom{m}{c}(1-p)^{m-c}p^{c}}$

(5)    $= \left(1 - \dfrac{1}{N}\right)\left(\dfrac{1-2p}{(1-p)(1-p)}\right)^{m} \dfrac{(m-b)^{(c)}}{m^{(c)}} \left(\dfrac{1-p}{1-2p}\right)^{b+c}$,

where $u^{(b)} = u(u - 1) \cdots (u - b + 1)$ denotes a descending factorial with $b$
factors.

Notice that, if the $n_b$ were independently distributed in Poisson distributions,
the second moments would be given by the same formula with $\Phi(b, c) = 1$,
while if they were distributed like a multinomial sample from an infinite
population the second moments would be given by the same formula with
$\Phi(b, c) = 1 - \dfrac{1}{N}$.

For small p, we have 


and if $m$ is large compared to $b$ and $c$, this approaches the multinomial value

    $\Phi(b, c) \approx \left(1 - \dfrac{1}{N}\right)$.


13. Variances and covariances. The variances and covariances are given by

    Variance $(n_b) = E(n_b n_b) - E(n_b)E(n_b)$

    $= E(n_b)\{1 - (1 - \Phi(b, b))E(n_b)\}$,

and

    Covariance $(n_b, n_c) = -(1 - \Phi(b, c))E(n_b)E(n_c)$.

Thus the covariance of $n_b$ and $n_c$ will vanish when, and only when $\Phi(b, c) = 1$.
Let us suppose pN = i, with p small and m and N large, and see if ${i), c) 

can be unity. Smce a preliminary calculation shows it to be reasonable, let us 
put m = yN. Then 

*(!,,»)«(!- tp) (1 - pV'd + P)'*. 





An easy calculation shows that the ratio of descending factorials is nearly 

^—bclyH _ 

making further natural approximations, 

In 9{b, c) W — /3p — — p — yNp^ + (5 + c)p 
y 

and this may be written 

In $( 6 , c)P:i~^(^l^2j-b-c + pJ + ipc-(b-fi~ 

and this vanishes for real y when and only when | & — /3 — c | > This, 

then, is the condition on $b$ and $c$ which permits the existence of two ratios of $m$
to $N$ so that for either ratio and large $N$ there will be no correlation between
$n_b$ and $n_c$.

14. Higher moments. To deal with the third moments, we need $E(x_{iq} y_{jq} z_{kq})$,
which is easily seen to behave as follows:

Relation of i, j, k     Number of occurrences     Expectation of $x_{iq} y_{jq} z_{kq}$
$i = j = k$             $N$                       $1 - p + pxyz$
$i = j \ne k$           $N(N - 1)$                $1 - 2p + pxy + pz$
$i = k \ne j$           $N(N - 1)$                $1 - 2p + pxz + py$
$j = k \ne i$           $N(N - 1)$                $1 - 2p + pyz + px$
different               $N(N - 1)(N - 2)$         $1 - 3p + px + py + pz$

Thus we have

    $\sum_{b,c,d} x^b y^c z^d E(n_b n_c n_d) = N(1 - p + pxyz)^m + N(N - 1)(1 - 2p + pxy + pz)^m$

    $+ N(N - 1)(1 - 2p + pxz + py)^m + N(N - 1)(1 - 2p + pyz + px)^m$

    $+ N(N - 1)(N - 2)(1 - 3p + px + py + pz)^m$

from which we can calculate all third moments.

In general if $\epsilon$ is a decomposition of the product $xyz \cdots w$ into $\alpha$ monomials
$u_1, u_2, \ldots, u_\alpha$, where order is disregarded (for example, $xyz = (xz)y = (zx)y =
y(zx) = y(xz)$ is a single decomposition with $\alpha = 2$, $u_1 = xz$, $u_2 = y$), then the
generating function becomes

    $\sum_\epsilon N^{(\alpha)} \left(1 - \alpha p + (u_1 + u_2 + \cdots + u_\alpha)p\right)^m$.

15. Poisson balls and equal boxes. To reach a Poisson distribution we let
$m \to \infty$ and $p \to 0$ so that $mNp = tN$, where $t$ is the average number of balls
per box in the Poisson distribution.










Since 



under these conditions, (3) becomes 

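(6)    $E(n_b) = N \dfrac{t^b}{b!} e^{-t}$

and from (5) it follows that the limit of $\Phi(b, c)$ is $\left(1 - \dfrac{1}{N}\right)$ so that

(7)    $E(n_b n_c) = N(N - 1) \dfrac{t^{b+c}}{b!\,c!} e^{-2t} + \delta(b, c) N \dfrac{t^b}{b!} e^{-t}$,

and hence

(8)    Variance $(n_b) = N \dfrac{t^b}{b!} e^{-t} \left(1 - \dfrac{t^b}{b!} e^{-t}\right)$,

(9)    Covariance $(n_b, n_c) = -N \dfrac{t^b}{b!} e^{-t} \cdot \dfrac{t^c}{c!} e^{-t}$.

Notice that these are the moments of the numbers of objects of types $b, c, \ldots$,
in a random sample of $N$ from an infinite population where the fraction of
$b$'s is $t^b e^{-t}/b!$, just as it should be.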
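The passage from (3) to (6) can be watched numerically by holding $mp = t$ fixed and letting $m$ grow. The following sketch does so; the particular values $N = 8$, $t = 1.5$, $b = 2$ are arbitrary illustrations, as are the function names.

```python
from math import comb, exp, factorial

def binomial_mean(n_boxes, m, p, b):
    """Formula (3): E(n_b) = N * C(m, b) * (1 - p)^(m - b) * p^b."""
    return n_boxes * comb(m, b) * (1 - p)**(m - b) * p**b

def poisson_mean(n_boxes, t, b):
    """Formula (6): E(n_b) = N * t^b * e^(-t) / b!."""
    return n_boxes * t**b * exp(-t) / factorial(b)

n_boxes, t, b = 8, 1.5, 2
for m in (10, 100, 10_000):
    print(m, binomial_mean(n_boxes, m, t / m, b), poisson_mean(n_boxes, t, b))
```

The binomial value approaches the Poisson value as $m$ increases, with a discrepancy of relative order $1/m$.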

16. Fixed or binomial balls and varied boxes. We now consider the case
where the chance of any ball entering the $i$th box is $p_i$. We shall again not
restrict ourselves to the case $\sum_i p_i = 1$.

The expectation of $x_{iq}$ is immediately seen to be $(1 + p_i(x - 1)) =
(1 - p_i + p_i x)$, so that the generating function is

    $E(f(x)) = \sum_i (1 - p_i + p_i x)^m$

and the expectation of $n_b$ is

(10)    $E(n_b) = \binom{m}{b} \sum_i (1 - p_i)^{m-b} p_i^b = \binom{m}{b} \sum_i (1 - p_i)^m \left(\dfrac{p_i}{1 - p_i}\right)^b$.

Following Stevens [4] with a slight modification, let us set $p_i = p(1 + \lambda_i)$,
where $p$ is the average of the $p_i$, so that $\sum_i \lambda_i = 0$. Then

    $(1 - p_i) = (1 - p(1 + \lambda_i)) = (1 - p)\left(1 - \dfrac{p\lambda_i}{1 - p}\right)$

so that

    $\sum_i (1 - p_i)^{m-b} p_i^b = (1 - p)^{m-b} p^b \sum_i \left(1 - \dfrac{p\lambda_i}{1 - p}\right)^{m-b} (1 + \lambda_i)^b$.




Expanding the summand, we find 
- b)p 


1 + 




, /(m - b)(m - b ~ l)p‘ (m ~ b)bp , bib - 1)\ 

+ \— - / ‘ 

Hence, setting S,Xf = A (notice this is not the same as Stevens’ A!), we have 


Ein,) 


n (1 - p)’^p'> 


iV + M 


/ m ~ h 
\m — b — 1 


(p(m — 1) — !>)* him — 


A)) 


(1 — p)» m — 6 

The expectation for all pi = p has been modified by multiplication by 
(H) ^ ' A / m ~ b ip(m — 1) — b)* bjm - 1) 


+ 0(2,X?). 


2N\m - b - 1 (1 ~ py~ 


m — b 




plus terms of higher order. For large N and consequently small p the quantity in 
braces is nearly 

b (b - 

\ m ~ bj 

and more roughly is approximately b’. Similarly, the expectations of second 
moments are 

Ein,n.) - Z (1 - p, - pr-^‘p\p] + i{b, Zf (1 - Pi)'^p'i, 


whence 


( 12 ) 


3>(b, c) - 


fc)ga -p. - Pi)-^v\v-, 


(7)(”)r.(i - pJ"-‘p!E,0 - P()—p! 

Making the same sort of expansion yields 

«> K’ - s) (’ - (i^X 

where terras in SX* have been neglected (note that 

Z = —Z = —A'), 

ift) • 

and where 

m — b — c N — 2 


'/' = 


m-b-c-lN-1^^ 2p) * ^ _ j, _ 1 (1 - P) 


■ {p(wi - 1 ) - b}® 





m ~ b — c N — 2 f. o \-2 m — c ,, ,_2 

i . - t _ , _ 1 aT^I (1 - ^ 


' (p(m - 1 ) — cp 


1 m — b — c ! N N — 2 


2m-b-c-l\N-l N - 1 


(1 - 2 p) > (b - c) 


m — b — c — 1 \N — 1 m — 6— 1 m — c — 1 


This can be reduced to 


^ = 2bc (2p -^) + 0(p“) + 0 
h 0 (p“) + 0 ^^^- 


and for p = 1/N + 0(p ) + 0 


ip = 2pbc + 0(p“). 

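17. Poisson balls and varied boxes. To reach the Poisson limit, we let $m \to \infty$
and $p_i \to 0$ so that $mp_i = t_i$. The generating function for first moments becomes

    $E(f(x)) = \sum_i e^{-t_i(1 - x)}$

and the expectation of $n_b$ is

(15)    $E(n_b) = \sum_i \dfrac{t_i^b}{b!} e^{-t_i}$.

If we set $t_i = t(1 + \lambda_i)$, this becomes

    $E(n_b) = \dfrac{t^b}{b!} e^{-t} \sum_i (1 + \lambda_i)^b e^{-t\lambda_i}$.

The summand expands in the form

    $\left(1 + b\lambda_i + \dfrac{b(b-1)}{2}\lambda_i^2 + \cdots\right)\left(1 - t\lambda_i + \dfrac{t^2}{2}\lambda_i^2 + \cdots\right)$

    $= 1 + (b - t)\lambda_i + \left(\dfrac{b(b-1)}{2} - bt + \dfrac{t^2}{2}\right)\lambda_i^2 + \cdots$.

If $t$ is chosen as the average of the $t_i$ so that $\sum_i \lambda_i = 0$, the sum becomes

    $N + \left(\dfrac{b(b-1)}{2} - bt + \dfrac{t^2}{2}\right)\sum_i \lambda_i^2 + \cdots$.

Again setting $\sum_i \lambda_i^2 = A$ we have

    $E(n_b) \approx \dfrac{t^b}{b!} e^{-t} \left(N + \dfrac{A}{2}\left((b - t)^2 - b\right)\right)$,

which can be written

    $E(n_b) \approx N \dfrac{t^b}{b!} e^{-t} \left(1 + \dfrac{A}{2N}\left((b - t)^2 - b\right)\right)$.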
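The quality of this second-order approximation can be gauged by comparing it with the exact sum (15) for a set of mildly unequal boxes. In the sketch below the list `t_values` is a hypothetical example with deviations of a few per cent from the average; the function names are likewise illustrative.

```python
from math import exp, factorial

def exact_mean(t_values, b):
    """Formula (15): E(n_b) = sum over boxes of t_i^b * e^(-t_i) / b!."""
    return sum(ti**b * exp(-ti) / factorial(b) for ti in t_values)

def approximate_mean(t_values, b):
    """N * t^b e^(-t) / b! * (1 + A/(2N) * ((b - t)^2 - b)), with A = sum of lambda_i^2."""
    n_boxes = len(t_values)
    t = sum(t_values) / n_boxes                          # average expected number per box
    a = sum((ti / t - 1)**2 for ti in t_values)          # lambda_i = t_i / t - 1
    equal_boxes = n_boxes * t**b * exp(-t) / factorial(b)
    return equal_boxes * (1 + a / (2 * n_boxes) * ((b - t)**2 - b))

# Hypothetical expected numbers of balls for five slightly unequal boxes.
t_values = [0.95, 1.0, 1.05, 1.02, 0.98]
for b in range(4):
    print(b, exact_mean(t_values, b), approximate_mean(t_values, b))
```

The neglected terms involve the sums of third and higher powers of the $\lambda_i$, so the agreement deteriorates as the boxes become more unequal.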

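The generating function for the second moments is

    $E(f(x)f(y)) = \sum_{i \ne j} e^{-t_i(1 - x) - t_j(1 - y)} + \sum_i e^{-t_i(1 - xy)}$,

so that the expectation of $n_b n_c$ is

(17)    $E(n_b n_c) = \sum_{i \ne j} \dfrac{t_i^b t_j^c}{b!\,c!} e^{-t_i - t_j} + \delta(b, c)E(n_b)$,

which becomes

    $E(n_b n_c) = \dfrac{t^{b+c}}{b!\,c!} e^{-2t} \sum_{i \ne j} (1 + \lambda_i)^b (1 + \lambda_j)^c e^{-t(\lambda_i + \lambda_j)} + \delta(b, c)E(n_b)$,

whence we can derive

(18)    $\Phi(b, c) \approx 1 - \dfrac{1}{N} - \dfrac{A}{N^2}(b - t)(c - t)$.

Thus

(19)    Variance $(n_b) \approx E(n_b) - \dfrac{1}{N}\left(1 + \dfrac{A}{N}(b - t)^2\right)(E(n_b))^2$,

(20)    Covariance $(n_b, n_c) \approx -\dfrac{1}{N}\left(1 + \dfrac{A}{N}(b - t)(c - t)\right)E(n_b)E(n_c)$.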

18. Boxes in a systematic square. Another case which it may be worthwhile
to write down arises when the boxes are systematically “rotated” under “spouts”
of different probability. That is, the number of balls $m$ is a multiple of the
number of boxes $N$, and the probability of the $q$th ball entering the $i$th box
depends on the value of $q - i$ taken modulo $N$. An example for $N = 3$ and
$m = 6$ follows:

Probabilities of entry

Box     Ball 1    2       3       4       5       6
1       $p_0$     $p_1$   $p_2$   $p_0$   $p_1$   $p_2$
2       $p_2$     $p_0$   $p_1$   $p_2$   $p_0$   $p_1$
3       $p_1$     $p_2$   $p_0$   $p_1$   $p_2$   $p_0$

If $m = kN$ and the subscript $r$ runs through $0, 1, 2, \ldots, N - 1$, then the
expectation of $f(x)$ becomes

    $E(f(x)) = N\left\{\prod_r (1 - p_r + p_r x)\right\}^k$.

Thus first moments, and by proceeding similarly higher moments, are available
for this case also.
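The first moments for the systematic square can be read off by multiplying out the polynomial $N\{\prod_r (1 - p_r + p_r x)\}^k$. The sketch below does this with a plain list of coefficients; the helper names and the spout probabilities in the usage lines are assumptions made for illustration.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def systematic_square_means(spout_probs, k):
    """Coefficients E(n_b) of N * (prod_r (1 - p_r + p_r x))**k, where N = len(spout_probs)."""
    n_boxes = len(spout_probs)
    one_cycle = [1.0]
    for p in spout_probs:                      # one full rotation of the N spouts
        one_cycle = poly_mul(one_cycle, [1 - p, p])
    total = [1.0]
    for _ in range(k):                         # m = k * N balls in all
        total = poly_mul(total, one_cycle)
    return [n_boxes * coefficient for coefficient in total]

# Hypothetical spout probabilities p_0, p_1, p_2 for N = 3 boxes and m = 6 balls.
means = systematic_square_means([0.5, 0.3, 0.2], k=2)
for b, value in enumerate(means):
    print(b, round(value, 4))
```

When all spout probabilities are equal the polynomial reduces to $N(1 - p + px)^m$, so the result agrees with formula (3).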


PART III 


THE POISSON CASE 


19. The Poisson case with equal boxes. The Poisson case is obtained in the
limit as $m \to \infty$ and $p \to 0$ with $pm = t$. We wish to show that, in the limit, the
numbers of balls in the different boxes are independent. Let $k_1, k_2, \ldots, k_N$ be
the number of balls in the first, second, ..., $N$th box, respectively. Then the
distribution of the $k$'s is given by, where we write $k = k_1 + k_2 + \cdots + k_N$,


m 


(i) 


fcll/Cjl * ■ * ky\ 


- Nv) 


m—h 


(1 - 

’ K\ 


Now the first two fractions clearly approach unity in the limit, and the
independence is proved.

Since the number of balls in each box has an independent Poisson distribution,
the distribution of the numbers of boxes each with exactly $b$ balls is that of a
random sample of $N$ from an infinite population; namely, it is a multivariate
distribution with probabilities

    $\dfrac{t^b e^{-t}}{b!}$.


REFERENCES

[1] H. B. Newcombe, “Delayed phenotypic expression of spontaneous mutations in
Escherichia coli,” Genetics, Vol. 33 (1948), pp. 447-476.

[2] I. Opatowski, “Chain processes and their biophysical applications; Part I. General
Theory,” Bulletin of Mathematical Biophysics, Vol. 7 (1945), pp. 161-180.

[3] V. Romanovsky, “Su due problemi di distribuzione casuale,” Giornale dell'Istituto
Italiano degli Attuari, Vol. 5 (1934), pp. 196-218.

[4] W. L. Stevens, “Significance of grouping,” Annals of Eugenics, Vol. 8 (1937-1938),
pp. 57-69.



THE POWER OF THE CLASSICAL TESTS ASSOCIATED WITH THE 

NORMAL DISTRIBUTION 

By J. Wolfowitz
Columbia University

Summary. The present paper is concerned with the power function of the
classical tests associated with the normal distribution. Proofs of Hsu, Simaika,
and Wald are simplified in a general manner applicable to other tests involving
the normal distribution. The set theoretic structure of several tests is
characterized. A simple proof of the stringency of the classical test of a linear hypothesis
is given.

1. Introduction. The present paper is concerned with the optimum properties,
from the power function viewpoint, of the classical tests associated with the
normal distribution. In 1941 Hsu [2] proved the result stated in Section 2 below,
which is concerned with the general linear hypothesis (in this connection his
paper [1] of 1938 will be of interest). Also in 1941 Simaika [3] proved similar
results for the tests based on the multiple correlation coefficient and Hotelling's
generalization of Student's t. In 1942, Wald [4] gave a generalization of Hsu's
result.

In the present paper we give short and simple proofs of almost all these
results, and a simple proof of the stringency property of the analysis of variance
(Section 5). These proofs rest on theorems which characterize the set theoretic
structure of the tests. Thus, while the proofs of Hsu, Simaika and Wald are
rather elaborate and each problem is essentially attacked de novo, the methods
of the present paper are in effect applicable to the classical tests based on the
normal distribution. For those tests it will not be difficult to demonstrate the
analogues of Theorems 1 and 3, and of the results of Hsu, Simaika, and Wald.
In the present paper we first treat the general linear hypothesis, because it is the
simplest problem, its solution is easiest to describe, and it admits Wald's
integration theorem. Multivariate analogues of the latter are rather artificial and not as
simple. We then discuss the problem of the multiple correlation coefficient,
because it seems to be more difficult than that of Hotelling's T and indeed, to
include all the essential multivariate difficulties. Theorems 6 and 7 are the
analogues of 1 and 3, respectively, while Theorem 9 describes the essential
property of the power function which is of interest to us. In other multivariate
problems one will prove the analogues of Theorems 6, 7 and 9. A generally
inclusive formulation is no doubt possible. Theorems 5 and 9 are slightly more
general than the theorems of Hsu and Simaika.

Many of the statements below may be not valid on exceptional sets of measure 
zero. Usually this is so stated, but sometimes, for reasons of brevity or to avoid 
repetition, this qualification may be omitted. The reader will have no difficulty 
supplying it wherever necessary. 




The author is indebted to Erich L. Lehmann of the University of California,
who carefully read a first version of this paper. Theorem 4 below was arrived at
independently by Professor Lehmann, with a somewhat different proof.


2. The general linear hypothesis. In canonical form the general linear
hypothesis may be stated as follows: The chance variables

$X_1, X_2, \cdots, X_{k+l}$

have at $x_1, \cdots, x_{k+l}$ the density function

(2.1) $(\sqrt{2\pi}\,\sigma)^{-(k+l)} \exp\left[-\frac{1}{2\sigma^2}\left(\sum_{i=1}^{k}(x_i-\eta_i)^2 + \sum_{i=k+1}^{k+l} x_i^2\right)\right] = f(\eta, \sigma)$

with $\sigma, \eta_1, \cdots, \eta_k$ all unknown.

Let $\eta$ be the vector $(\eta_1, \cdots, \eta_k)$. The null hypothesis $H_0$ states that

$\eta_1 = \cdots = \eta_k = 0$

and is to be tested with constant size $\alpha < 1$ (identically in $\sigma$).

Let D be any admissible critical region for testing $H_0$. If A is any event let
$P\{A \mid \eta, \sigma\}$ denote the probability of A when $\eta$ and $\sigma$ are the parameters of
(2.1). We have then

$P\{D \mid 0, \sigma\} = \alpha$

identically in $\sigma$, where 0 is the vector with k components all of which are zero.
We now prove a property which characterizes all D. This theorem is due to
Neyman and Pearson [12], and is given here only for completeness.

Theorem 1. The fraction of the surface area of the sphere

$\sum_{j=1}^{k+l} x_j^2 = c^2$

which lies in D is $\alpha$ for almost all c.

Proof. Let a be any positive integer, h a positive parameter, and $\psi(y)$ a
measurable function of y defined for $y \ge 0$ and such that $0 \le \psi(y) \le 1$. In view
of the distribution of $\sum X_i^2$, it will be enough to prove that, if

$\frac{h^{a+1}}{\Gamma(a+1)} \int_0^\infty \psi(y)\, y^a e^{-hy}\, dy = \alpha$

identically for all positive h, that then

$\psi(y) = \alpha$ for almost all y.


Write

(2.2) $\frac{1}{\alpha\,\Gamma(a+1)} \int_0^\infty \psi(y)\, y^a e^{-hy}\, dy = h^{-(a+1)}$.


Differentiating both members k times with respect to h and then setting h = 1 





we obtain the following result. The function

$\frac{\psi(y)\, y^a e^{-y}}{\alpha\,\Gamma(a+1)}$

is a density function with kth moment

$\mu_k = (a+1)(a+2)\cdots(a+k)$.

The moments $\mu_k$ are the moments of the density function

$\frac{1}{\Gamma(a+1)}\, y^a e^{-y}$.

They satisfy the Carleman criterion [5, p. 19, Th. 1.10], and hence no essentially
different distribution can have these moments. This proves the desired result.
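
The moment identity used in this proof can be checked numerically. The sketch below is an illustration only and not part of the original argument: it integrates $y^k \cdot y^a e^{-y}/\Gamma(a+1)$ by a midpoint rule and compares the result with $(a+1)(a+2)\cdots(a+k)$. The value a = 2.5, the integration range and the step count are arbitrary choices made for the example.

```python
import math

def gamma_density(y, a):
    """Density y^a e^{-y} / Gamma(a+1) on y > 0."""
    return math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))

def moment(k, a, upper=80.0, steps=80_000):
    """k-th moment of the density above, by the midpoint rule on (0, upper)."""
    width = upper / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * width
        total += (y ** k) * gamma_density(y, a)
    return total * width

def rising_product(k, a):
    """The product (a+1)(a+2)...(a+k)."""
    out = 1.0
    for i in range(1, k + 1):
        out *= a + i
    return out

a = 2.5  # illustrative value only
for k in range(1, 6):
    print(k, moment(k, a), rising_product(k, a))
```

The two printed columns should agree to several significant figures for each k.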

Theorem 2 (Wald). Among all tests of the general linear hypothesis the analysis
of variance test has the property that, for all positive d, the integral of its power on the
surface $\sum_1^k \eta_i^2 = d^2$ is a maximum.

Proof. Let c be any positive number. We have only to show that if we allocate
to the critical region D of the test the fraction $\alpha$ of the surface area of the sphere

(2.3) $\sum_{i=1}^{k+l} x_i^2 = c^2$

for which

$\frac{\sum_{i=1}^{k} x_i^2}{\sum_{i=k+1}^{k+l} x_i^2}$

is as large as possible and that if we do this for all c, the desired maximum of the
integral of the power will be achieved. If this ratio is as large as possible so is

$\frac{\sum_{1}^{k} x_i^2}{\sum_{1}^{k+l} x_i^2}$.

Let $a_1, \cdots, a_{k+l}$ be any point on the sphere (2.3). Let db be the differential of
area on the surface $\sum_1^k \eta_i^2 = d^2$. Then

(2.4) $\int \cdots \int f(\eta, \sigma)\, db = (\sqrt{2\pi}\,\sigma)^{-(k+l)} \exp\left(-\frac{c^2 + d^2}{2\sigma^2}\right) \int \cdots \int \exp\left(\frac{(\eta)'z}{\sigma^2}\right) db$

where z is the vector $(a_1, \cdots, a_k)$ and $(\eta)'z$ is the scalar product of the two
vectors. This last integral is easily seen to depend only upon $|z|$ and to be
monotonically increasing in $|z|$. This proves the theorem.
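
The closing claim about the integral in (2.4) can be illustrated for k = 2, where the surface on which the length of the vector of means equals d is a circle. The sketch below is a numerical illustration under that assumption, with arbitrary values of d and z; it is not the general argument. It evaluates the integral of the exponential of the scalar product divided by the variance, for two vectors z of equal length and for vectors of increasing length.

```python
import math

def surface_integral(z, d, sigma=1.0, steps=2000):
    """Integral of exp(eta'z / sigma^2) over the circle |eta| = d (case k = 2)."""
    total = 0.0
    for i in range(steps):
        theta = 2.0 * math.pi * (i + 0.5) / steps
        eta = (d * math.cos(theta), d * math.sin(theta))
        total += math.exp((eta[0] * z[0] + eta[1] * z[1]) / sigma ** 2)
    # each midpoint represents an arc of length 2*pi*d/steps
    return total * (2.0 * math.pi * d / steps)

same_length_a = surface_integral((1.2, 0.0), d=2.0)
same_length_b = surface_integral((0.72, 0.96), d=2.0)   # also has |z| = 1.2
growing = [surface_integral((t, 0.0), d=2.0) for t in (0.5, 1.0, 1.5)]
print(same_length_a, same_length_b, growing)
```

The first two values should coincide up to quadrature error, and the last three should increase with the length of z.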





Corollary (Hsu). Among all tests of the general linear hypothesis whose power
is a function of $\sum_1^k \eta_i^2/\sigma^2$ only, the analysis of variance is the most powerful.

3. The set theoretic structure of tests whose power is a function only of
$\sum_1^k \eta_i^2/\sigma^2$. Wald's result (Theorem 2) cannot always be extended, in its simple
form, to tests involving the multivariate normal distribution, but this can be done
with Hsu's theorem (corollary to Theorem 2). In order to see what is involved we
shall investigate the set theoretic structure of tests of the general linear
hypothesis whose power is a function only of $\sum_1^k \eta_i^2/\sigma^2$.

Let $q(x_1, \cdots, x_k)$ be the set of points in the region D whose first k coordinates
are $x_1, \cdots, x_k$. Let $A(x_1, \cdots, x_k, \sigma)$ be the integral of

$(\sqrt{2\pi}\,\sigma)^{-l} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=k+1}^{k+l} x_i^2\right]$

with respect to $x_{k+1}, \cdots, x_{k+l}$, taken over $q(x_1, \cdots, x_k)$. We first prove the
following:

Lemma. Suppose the power of D is a function only of $\sum_1^k \eta_i^2/\sigma^2$. Then for two
points $x_1, \cdots, x_k$ and $x_1', \cdots, x_k'$ such that

(3.1) $\sum_{1}^{k} x_i^2 = \sum_{1}^{k} x_i'^2$

we have

(3.2) $A(x_1, \cdots, x_k, \sigma) = A(x_1', \cdots, x_k', \sigma)$

identically in $\sigma$, with the exception of a set of measure zero.

Proof. Suppose the statement is false. Then under some orthogonal
transformation T on $x_1, \cdots, x_k$ the region D would go over into a region D* with the
following property: Let $A^*(x_1, \cdots, x_k, \sigma)$ have the same definition for the region
D* as $A(x_1, \cdots, x_k, \sigma)$ has for D. Then on a set of positive measure we would
have

(3.3) $A(x_1, \cdots, x_k, \sigma) \ne A^*(x_1, \cdots, x_k, \sigma)$.

We shall now show that (3.3) results in a contradiction. We have

(3.4) $P\{D \mid \eta, \sigma\} = P\{D^* \mid T\eta, \sigma\}$

identically in $\eta$. By the property of the region D, therefore, we have

$P\{D \mid \eta, \sigma\} = P\{D \mid T^{-1}\eta, \sigma\}$

¹ The situation here is similar to that described in footnote 3.





and hence

(3.5) $P\{D \mid \eta, \sigma\} = P\{D^* \mid \eta, \sigma\}$

identically in $\eta$. Thus we obtain

(3.6) $\int (2\pi\sigma^2)^{-k/2} A(x_1, \cdots, x_k, \sigma) \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{k}(x_i - \eta_i)^2\right] dx_1 \cdots dx_k$

$\equiv \int (2\pi\sigma^2)^{-k/2} A^*(x_1, \cdots, x_k, \sigma) \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{k}(x_i - \eta_i)^2\right] dx_1 \cdots dx_k$

with the integrations taking place over the entire space. Differentiating both
members with respect to the components of $\eta$ and setting $\eta = 0$, we obtain that
the two density functions (for fixed $\sigma$)

$(2\pi\sigma^2)^{-k/2} \alpha^{-1} A(x_1, \cdots, x_k, \sigma) \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{k} x_i^2\right]$

and

$(2\pi\sigma^2)^{-k/2} \alpha^{-1} A^*(x_1, \cdots, x_k, \sigma) \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{k} x_i^2\right]$

have identical moments. We shall now argue that these moments satisfy the
conditions of Cramér and Wold [7, Th. 2], so that the two density functions are
essentially the same, in contradiction to (3.3). The Cramér-Wold theorem states
the following: Let $Y_1, \cdots, Y_k$ be k chance variables with a joint distribution
function, and write

$\lambda_{2n} = \sum_{i=1}^{k} E\,Y_i^{2n}$.

Then the divergence of the series

$\sum_{n=1}^{\infty} \lambda_{2n}^{-1/(2n)}$

is sufficient to ensure that there exists essentially only one distribution which has
these moments. We notice that the factor $1/\alpha$ of course makes no difference.
If we set $A(x_1, \cdots, x_k, \sigma)$ and $A^*(x_1, \cdots, x_k, \sigma)$ both identically unity and
consider the resulting moments which enter into the $\lambda_{2n}$, we see that these
moments satisfy the Cramér-Wold condition. Now A and A* are $\le 1$. Thus,
using the true value of A can serve only to increase the value of $\lambda_{2n}^{-1/(2n)}$, so that
the series will diverge a fortiori. This proves the lemma.
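
The divergence invoked here can be seen numerically in the reference case where A is identically unity, so that the variables are independent normal with mean zero. The sketch below assumes the standard normal moment formula, in which the moment of order 2n equals the variance to the power n times (2n-1)!!; the choices k = 3 and unit variance are illustrative. Growing partial sums suggest divergence but of course do not prove it.

```python
import math

def log_double_factorial_odd(n):
    """log (2n-1)!!, using (2n-1)!! = (2n)! / (2^n n!)."""
    return math.lgamma(2 * n + 1) - n * math.log(2.0) - math.lgamma(n + 1)

def series_term(n, k, sigma=1.0):
    """lambda_{2n}^(-1/(2n)) for k independent N(0, sigma^2) variables."""
    log_lambda = math.log(k) + 2 * n * math.log(sigma) + log_double_factorial_odd(n)
    return math.exp(-log_lambda / (2 * n))

def partial_sum(upper, k, sigma=1.0):
    """Sum of the first `upper` terms of the Cramér-Wold series."""
    return sum(series_term(n, k, sigma) for n in range(1, upper + 1))

for upper in (100, 1_000, 10_000):
    print(upper, partial_sum(upper, k=3))
```

The terms decay roughly like the reciprocal square root of n, so the partial sums keep growing roughly like the square root of the number of terms.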

The following theorem helps to describe the set theoretic structure of tests
whose power is a function only of $\lambda = \sum_1^k \eta_i^2/\sigma^2$.

Theorem 3. Let D be a test whose power is a function only of $\lambda$. Let u be any
positive number, and $D(x_1, \cdots, x_k, u)$ be the fraction of the "area" of the sphere
$\sum_{j=1}^{l} x_{k+j}^2 = u^2$ occupied by points which are in D and whose first k coordinates
are $x_1, \cdots, x_k$. If

(3.7) $\sum_{1}^{k} x_i^2 = \sum_{1}^{k} x_i'^2$

then, except on a set of measure zero,

(3.8) $D(x_1, \cdots, x_k, u) = D(x_1', \cdots, x_k', u)$.

Proof. We shall show that, if the power of D is a function only of $\lambda$, the
failure of (3.7) to imply (3.8) would contradict the preceding lemma. Suppose
then that (3.8) is not true on a set of positive measure. Under some orthogonal
transformation on $x_1, \cdots, x_k$ we obtain² a function $D^*(x_1, \cdots, x_k, u)$ which
differs from $D(x_1, \cdots, x_k, u)$ on a set of positive measure and such that, for
almost every $x_1, \cdots, x_k$,

$A(x_1, \cdots, x_k, \sigma) = K \int_0^\infty D(x_1, \cdots, x_k, u)\, \sigma^{-l} u^{l-1} \exp\left(-\frac{u^2}{2\sigma^2}\right) du$

$= K \int_0^\infty D^*(x_1, \cdots, x_k, u)\, \sigma^{-l} u^{l-1} \exp\left(-\frac{u^2}{2\sigma^2}\right) du$

identically in $\sigma$, where K is a suitable constant of no interest to us. Multiplying
by $\sigma^l$, differentiating repeatedly under the integral sign with respect to $\sigma$, and
setting $\sigma = 1$, we obtain the result that the two density functions in u,

$\frac{K\, D(x_1, \cdots, x_k, u)\, u^{l-1} \exp(-u^2/2)}{A(x_1, \cdots, x_k, 1)}$

and

$\frac{K\, D^*(x_1, \cdots, x_k, u)\, u^{l-1} \exp(-u^2/2)}{A(x_1, \cdots, x_k, 1)}$

are identical except perhaps on a set of measure zero. This contradiction proves
the theorem.

Theorem 4. A necessary and sufficient condition that the power of D be a function
of $\lambda$ only, is that, with the usual exception of a set of measure zero, $D(x_1, \cdots, x_k, u)$
be a function only of

$\frac{\sum_1^k x_i^2}{u^2}$.

The proof of this theorem is not essentially different from that of the preceding
theorem, and we shall therefore sketch it only briefly. Let Z be a transformation
on $(x_1, \cdots, x_k, u) = (x, u)$ which consists of a rotation of the vector x, followed
by a multiplication of u and the components of x by a positive constant c. If
$D(x, u)$ is not a function of $\sum x_i^2/u^2$ alone, then, just as before³, we can use some


² See footnote 1.

³ This statement implies that a function of $x_1, \cdots, x_k, u$, which is invariant to within
sets of measure zero under all transformations Z (the exceptional set may depend on the




transformation Z to give us a function $D^*(x, u)$ such that

$D(x, u) \ne D^*(x, u)$

on a set of positive measure, while

$E\,D(x, u) = E\,D^*(x, u)$

identically in $\eta$, $\sigma$. This yields a contradiction in the usual manner and proves
the necessity of the condition.

To prove sufficiency, write $D(x, u) = \nu(\sum x_i^2/u^2) = \nu(v)$. Let $\gamma(v, \eta, \sigma)$ be the
density function of v. Then

$P\{D \mid \eta, \sigma\} = \int_0^\infty \nu(v)\, \gamma(v, \eta, \sigma)\, dv$.

By hypothesis, $\nu(v)$ is a function only of v. We know [9, p. 140, eq. 101] that
$\gamma(v, \eta, \sigma)$ is a function only of v and $\lambda$. Hence $P\{D \mid \eta, \sigma\}$ is a function only of $\lambda$.
This completes the proof of the theorem.

Theorem 5. Among all tests of the general linear hypothesis which have the
properties described in the conclusions of Theorems 1 and 3, the classical analysis
of variance test is the most powerful.

We shall omit the proof of this theorem, which is very similar to that of the
more difficult Theorem 9 below.

Theorem 4 above shows that there exist regions D which satisfy the conclusions
of Theorems 1 and 3 and such that $P\{D \mid \eta, \sigma\}$ is not a function of $\lambda$ alone. It
follows that the content of Theorem 5 is greater than that of Hsu's theorem
(Corollary to Theorem 2).

It is instructive to note that Hsu's theorem follows almost immediately from
Theorem 4 and the form of $\gamma(v, \lambda)$. For let $\lambda$ be fixed but arbitrary. One verifies
immediately from the form of $\gamma(v, \lambda)$ that

$\frac{\gamma(v, \lambda)}{\gamma(v, 0)}$

is, for fixed $\lambda$, a monotonically increasing function of v. This, by Neyman's
lemma, immediately proves Hsu's result.
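
The monotonicity of this ratio can be illustrated numerically. The sketch below rests on two assumptions that are not taken from the paper: the noncentrality is parametrized as the sum of the squared means divided by the variance, and the density of v is written in the standard Poisson-mixture form of the noncentral distribution, with Poisson mean equal to half the noncentrality. With w = v/(1+v), the ratio then becomes a power series in w with positive coefficients. The degrees of freedom k = 3, l = 10 are arbitrary.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def density_ratio(v, lam, k, l, terms=200):
    """gamma(v, lam) / gamma(v, 0) for v = (sum of first k squares) / u^2.

    Assumes noncentrality lam = sum(eta_i^2) / sigma^2 (a labelled assumption).
    """
    w = v / (1.0 + v)
    a, b = k / 2.0, l / 2.0
    total = 0.0
    for j in range(terms):
        log_poisson = -lam / 2.0 + j * math.log(lam / 2.0) - math.lgamma(j + 1.0)
        log_coef = log_beta(a, b) - log_beta(a + j, b)
        total += math.exp(log_poisson + log_coef + j * math.log(w))
    return total

grid = (0.1, 0.5, 1.0, 2.0, 5.0)
ratios = [density_ratio(v, lam=4.0, k=3, l=10) for v in grid]
print(ratios)
```

Every coefficient of the series is positive, which is why the computed ratios increase along the grid.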

4. The multiple correlation coefficient. We shall now apply our methods to a
multivariate test. For typographic ease we shall conduct the discussion for the

transformation), is a function of $\sum_1^k x_i^2/u^2$ except on a set of measure zero. This
statement would be completely trivial were it not for the exceptional sets; in any case it
must be well known to set theorists. The author constructed an unnecessarily long proof of
it, and believes that a more expeditious proof can be constructed using the ideas of
[11, page 01, Theorem 111, and page 318, p. 7]. Professor C. M. Stein of the University of
California has informed the author that this result is a special case of one established by
himself and G. H. Hunt in a forthcoming paper. For these reasons the proof is omitted.
(See also [13, page 27, Lemma 9.1].)





case of three variates, but the reader will observe that the procedure is really 
perfectly general. 

The chance variables $Y_{ij}$, i = 1, 2, 3, j = 1, $\cdots$, n, have the density
function

(4.1) $g(B) = (2\pi)^{-3n/2}\, |B|^{n/2} \exp\left[-\frac{1}{2} \sum_{j=1}^{n} \sum_{i,l=1}^{3} b_{il}\, y_{ij}\, y_{lj}\right]$

where 1) $B = \{b_{il}\}$ is a positive definite (symmetric) 3 × 3 matrix, 2) $y_{ij}$ is the
value assumed by $Y_{ij}$. The null hypothesis $H_0$ asserts that a given multiple
correlation coefficient is zero, say that of $Y_1$ with $Y_2$ and $Y_3$, i.e.,

(4.2) $b_{12} = b_{13} = b_{21} = b_{31} = 0$.

The test is to be made on the level of significance $\alpha$, i.e., if $B_0$ is any matrix
which satisfies (4.2), and if G is a critical region for testing $H_0$, then

(4.3) $P\{G \mid B_0\} = \alpha$

where the symbol in the left member means the probability of G according
to $g(B_0)$.

Write

$n\, s_{il} = \sum_{j=1}^{n} y_{ij}\, y_{lj}$,

$S = \begin{pmatrix} s_{22} & s_{23} \\ s_{32} & s_{33} \end{pmatrix}$.

Let $M(c_{11}, C)$ be the manifold in the 3n-space of

$y_{11}, \cdots, y_{1n}, \cdots, y_{31}, \cdots, y_{3n}$

where $s_{11} = c_{11}$, $S = C$. First we prove the following.

Theorem 6. Any region G which satisfies (4.3) must have the property that the
fraction of the area of $M(c_{11}, C)$ which lies in G is $\alpha$, for any positive $c_{11}$ and any
positive definite 2 × 2 matrix $C = \{c_{ij}\}$. (We remind the reader that exceptional
sets of measure zero are not precluded.)

Proof. Let $\psi(c_{11}, C)$ be the fraction of the area of $M(c_{11}, C)$ in G. Recall
equation (4.3) and the fact that $s_{11}, s_{22}, s_{23}, s_{33}$ are sufficient statistics for the
elements of $B_0$. On the manifold $M(c_{11}, C)$ the conditional density is uniform.
Employing Wishart's distribution [6] we conclude that

(4.4) $K' \int \psi(s_{11}, S)\, |B_0|^{n/2}\, |S|^{(n-3)/2}\, s_{11}^{(n-2)/2} \exp\left[-\frac{n}{2}\left(b_{11}s_{11} + b_{22}s_{22} + 2b_{23}s_{23} + b_{33}s_{33}\right)\right] ds_{11}\, ds_{22}\, ds_{23}\, ds_{33} \equiv \alpha$

where K' is a suitable constant which need not concern us. Here the symbol





"$\equiv$" means identically in $b_{11}, b_{22}, b_{23}, b_{33}$, provided only that $b_{11} > 0$, $b_{22} > 0$,
$b_{22}b_{33} - b_{23}^2 > 0$. Of course $s_{11}$ is distributed independently of $s_{22}, s_{23}, s_{33}$.
Proceeding as in section 2, we can, by differentiation with respect to the b's,
obtain all the moments of the $s_{ij}$'s. Now let the b's take any admissible constant
values. The moments of the $s_{ij}$'s are then seen to satisfy the criterion of Cramér
and Wold [7, Th. 2], and consequently essentially uniquely determine the
distribution of the $s_{ij}$. The desired conclusion follows as before.

The six parameters which uniquely determine the trivariate normal distribution
(of $Y_1, Y_2, Y_3$) with zero means may be taken to be the following:

1) The covariance matrix $\{\sigma_{ij}\}$, i, j = 2, 3, of $Y_2$ and $Y_3$.

2) The partial regression coefficients $\beta_2, \beta_3$ of $Y_1$ on $Y_2$ and $Y_3$. These are
defined as follows: Let $E(Y_1 \mid Y_2 = y_2, Y_3 = y_3)$ denote the conditional expected
value of $Y_1$, given $Y_2 = y_2$, $Y_3 = y_3$. Then

$E(Y_1 \mid Y_2 = y_2, Y_3 = y_3) = \beta_2 y_2 + \beta_3 y_3$.

3) The conditional variance $\omega^2$ of $Y_1$, given $Y_2 = y_2$, $Y_3 = y_3$.

The population multiple correlation coefficient $\bar{R}$ of $Y_1$ with $Y_2$ and $Y_3$ is then
defined by

$\frac{\bar{R}^2 \omega^2}{1 - \bar{R}^2} = \beta_2^2 \sigma_{22} + 2\beta_2\beta_3 \sigma_{23} + \beta_3^2 \sigma_{33}$.

The six parameters above may be chosen arbitrarily, provided only that $\{\sigma_{ij}\}$
is positive definite. $\bar{R}$ and $\omega$ are, by definition, non-negative.

Let $y_i$ be the column vector $y_{i1}, \cdots, y_{in}$; let $y_i'$ be its transpose, and let y
denote the point $y_{11}, y_{12}, \cdots, y_{1n}, y_{21}, \cdots, y_{3n}$ in 3n-space. Let $z(y) =
z(y_1, y_2, y_3)$ be the component of $y_1$ in the plane of $y_2$ and $y_3$; let $r = |z(y)|$ and $\theta$
the angle between z and $y_2$, measured positively say in the direction of $y_3$.
Finally let h be the absolute value of the vector $y_1 - z(y_1, y_2, y_3)$.
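
The quantities r, h and the angle just defined can be computed directly from three sample vectors. The sketch below is an illustration with made-up data: it projects the first vector on the plane of the other two by solving the 2 × 2 normal equations, and returns r, h and the unsigned angle (the sign convention of the text is not reproduced). The relation between r, h and the sample multiple correlation about zero means, R = r / sqrt(r² + h²), is standard least-squares geometry and is stated here as an assumption rather than quoted from the paper.

```python
import math

def dot(u, v):
    """Scalar product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def decompose(y1, y2, y3):
    """Return (r, h, theta) for the projection z of y1 on the plane of y2, y3."""
    g22, g23, g33 = dot(y2, y2), dot(y2, y3), dot(y3, y3)
    b2, b3 = dot(y1, y2), dot(y1, y3)
    det = g22 * g33 - g23 * g23           # nonzero when y2, y3 are not parallel
    c2 = (b2 * g33 - b3 * g23) / det      # z = c2*y2 + c3*y3
    c3 = (b3 * g22 - b2 * g23) / det
    z = [c2 * p + c3 * q for p, q in zip(y2, y3)]
    r = math.sqrt(dot(z, z))
    residual = [p - q for p, q in zip(y1, z)]
    h = math.sqrt(dot(residual, residual))
    cosine = dot(z, y2) / (r * math.sqrt(g22))
    theta = math.acos(max(-1.0, min(1.0, cosine)))   # unsigned angle only
    return r, h, theta

# hypothetical data, n = 4 observations on each of three variates
y1 = [1.0, 2.0, 0.5, -1.0]
y2 = [0.5, 1.5, -0.2, 0.3]
y3 = [1.0, -0.4, 0.8, 0.1]
r, h, theta = decompose(y1, y2, y3)
sample_R = r / math.sqrt(r * r + h * h)
print(r, h, theta, sample_R)
```

Multiplying the first vector by a positive scalar rescales r and h together and leaves the ratio h/r, and hence the computed R, unchanged.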

We intend now to investigate the set theoretic structure of tests whose power
is a function only of $\bar{R}$, and for this purpose prove the following:

Theorem 7. Let H be a region whose power is a function only of $\bar{R}$. Let
$V(h, r, \theta, s_{22}, s_{23}, s_{33})$ be the fraction of the "volume" of the manifold on which
$h, r, \theta, s_{22}, s_{23}, s_{33}$ are fixed which is contained in H. With the usual exception of a
set of measure zero, for fixed $h, r, s_{22}, s_{23}, s_{33}$, the quantity V above is constant for
all $\theta$.

Later, after this theorem is proved, we shall write V without exhibiting $\theta$.
This procedure is justified by Theorem 7.

Proof. Suppose the theorem false, and proceed as in Theorem 3. A suitable⁴
rotation of the radius vector z(y) implies an orthogonal transformation T on the
generic point y which leaves $h, r, s_{22}, s_{23}$, and $s_{33}$ unaltered, and takes the region H
into a region H* such that H and H* differ on a set of positive measure. T leaves
R invariant, hence leaves invariant $\bar{R}$, which uniquely determines the distribution


⁴ See footnote 1.





of R. Hence an argument almost the same as that which led us to (3.5) yields the
conclusion that the power of H and the power of H* are equal, identically in B.
Proceeding as in Theorem 3, we obtain two essentially different density functions
in $h, r, \theta, s_{22}, s_{23}, s_{33}$, whose integrals over the entire space are identical in the
elements of B. From these functions we obtain two different density functions in
$s_{ij}$ (i, j = 1, 2, 3), with identical moments (obtained by differentiation with
respect to the elements of B). The rest of the proof is essentially no different
from that of Theorem 3.

Theorem 8. In order that the power of H be a function of $\bar{R}$ alone, it is necessary
and sufficient that, with the usual exception of a set of measure zero,
$V(h, r, s_{22}, s_{23}, s_{33})$ be a function only of h/r (i.e., of R).

The proof of this theorem is essentially the same as the proof of Theorem 4.
The place of the transformation Z is taken by one which consists of any linear
transformation on the vectors $y_2$ and $y_3$, the addition of a constant angle to $\theta$
(rotation of z(y)), and multiplication of the vector $y_1$ by a positive scalar c.
This transformation leaves R invariant. In the proof of sufficiency we use the
distribution of R (see, for example, [10, p. 384, equation (15.55)]). The remainder
of the proof is essentially the same as that of Theorem 4.

Theorem 9. Among all tests H which have the properties described in the
conclusions of Theorems 6 and 7, the classical test based on R is the most powerful.

As a corollary to this theorem we have the following result due to Simaika
[3]: Of all tests H whose power is a function of $\bar{R}$ only, the classical test based
on R is the most powerful.

Simaika's result also follows easily from Theorem 8 and the density function
of R in the same manner that Hsu's result followed from Theorem 4 and the
density function of v.

In the course of the proof of Theorem 9, the various symbols W, with or
without subscripts, will denote suitable functions of the variables exhibited,
and the various symbols K, with or without subscripts, will denote suitable
constants.

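We have that

$P\{H \mid B\} = \int_H (2\pi)^{-3n/2}\, |B|^{n/2} \exp\left\{-\frac{1}{2} \sum_{j=1}^{n} \sum_{i,l=1}^{3} b_{il}\, y_{ij}\, y_{lj}\right\} dy_{11} \cdots dy_{3n}$

$= \int_H (2\pi\omega^2)^{-n/2} \exp\left\{-\frac{1}{2\omega^2}\left[y_1 - (\beta_2 y_2 + \beta_3 y_3)\right]'\left[y_1 - (\beta_2 y_2 + \beta_3 y_3)\right]\right\} W_0(s_{22}, s_{23}, s_{33}, \{\sigma_{ij}\})\, dy_{11} \cdots dy_{3n}$

(4.5) $= (2\pi\omega^2)^{-n/2} \int_H \exp\left\{\frac{1}{\omega^2}(\beta_2 y_2 + \beta_3 y_3)'y_1\right\} \exp\left\{-\frac{1}{2\omega^2}\left(y_1'y_1 + n\beta_2^2 s_{22} + 2n\beta_2\beta_3 s_{23} + n\beta_3^2 s_{33}\right)\right\} W_0(s_{22}, s_{23}, s_{33}, \{\sigma_{ij}\})\, dy_{11} \cdots dy_{3n}$.

Now $(\beta_2 y_2 + \beta_3 y_3)'z$ is a function only of $\beta_2, \beta_3, s_{22}, s_{23}, s_{33}, r$, and $\theta$. Also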






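$h^2 + r^2 = n s_{11} = y_1'y_1 = y_1^2$. Thus

$P\{H \mid B\} = \int V(h, r, s_{22}, s_{23}, s_{33})\, W_1(h, r, s_{22}, s_{23}, s_{33}, \{B\}) \left\{\int \exp\left[\frac{1}{\omega^2}(\beta_2 y_2 + \beta_3 y_3)'z\right] d\theta\right\} dh\, dr\, ds_{22}\, ds_{23}\, ds_{33}$

(4.6) $= \int V(h, r, s_{22}, s_{23}, s_{33})\, W_2(h, r, s_{22}, s_{23}, s_{33}, \{B\})\, (4hr)^{-1} \exp\left[\frac{1}{\omega^2}(\beta_2 y_2 + \beta_3 y_3)'z\right] d\theta\, dh^2\, dr^2\, ds_{22}\, ds_{23}\, ds_{33}$

$= \int V(\sqrt{y_1^2 - r^2}, r, s_{22}, s_{23}, s_{33})\, W_3(\sqrt{y_1^2 - r^2}, r, s_{22}, s_{23}, s_{33}, \{B\}) \exp\left[\frac{1}{\omega^2}(\beta_2 y_2 + \beta_3 y_3)'z\right] d\theta\, dr^2\, dy_1^2\, ds_{22}\, ds_{23}\, ds_{33}$.

Integrating with respect to $\theta$ and designating

$W_3 \int \exp\left[\frac{1}{\omega^2}(\beta_2 y_2 + \beta_3 y_3)'z\right] d\theta$

by $W(\sqrt{y_1^2 - r^2}, r, s_{22}, s_{23}, s_{33}, \{B\})$ we observe that just as in (2.4), W is
monotonically increasing in r (all other variables fixed). Thus we have

(4.7) $P\{H \mid B\} = \int V\, W\, dr^2\, dy_1^2\, ds_{22}\, ds_{23}\, ds_{33}$.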


In constructing H only the function V is at our disposal, and this subject to the
limitations imposed by the conclusions of Theorems 6 and 7 and the fact that
$h^2 + r^2 = y_1^2 = n s_{11}$. The function W is not within our control at all. With $y_1^2$,
$s_{22}, s_{23}, s_{33}$ fixed, W is monotonically increasing with r. To maximize the power
it is therefore best to distribute the "mass" so that V is as large as possible for
large values of r and hence of R. This implies the classical test and proves the
theorem.


5. Stringency of the classical tests. Wald [8] calls a test $T_1$ "most stringent"
if the following is true: Let {T} be the totality of tests. Let $\theta$ be the generic
point in the parameter space, and $P\{T \mid \theta\}$ be the power of T at the point $\theta$.
Let $T_2$ be any test other than $T_1$. Then

$\sup_{\theta}\left[\sup_{\{T\}} P\{T \mid \theta\} - P\{T_1 \mid \theta\}\right] \le \sup_{\theta}\left[\sup_{\{T\}} P\{T \mid \theta\} - P\{T_2 \mid \theta\}\right]$.

Of course, we have omitted to specify the totality {T}. One can admit all tests
whose size $\le \alpha$, a given constant between 0 and 1, or restrict one's self to tests
whose size is exactly $\alpha$. We shall do the latter.
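
The definition can be made concrete on a toy example. In the sketch below the parameter space is replaced by three points and the totality of tests by three tests; the power values are hypothetical numbers invented for the illustration. For each test the code computes the largest gap between the best attainable power and its own power, and selects the test for which this largest gap is smallest.

```python
def max_shortcoming(power, envelope):
    """Largest gap between the envelope power and the power of one test."""
    return max(best - own for own, best in zip(power, envelope))

# hypothetical power values at three parameter points (illustration only)
powers = {
    "T1": [0.30, 0.55, 0.80],
    "T2": [0.40, 0.50, 0.70],
    "T3": [0.20, 0.60, 0.85],
}

# envelope: the best power attainable at each parameter point
envelope = [max(column) for column in zip(*powers.values())]
shortcomings = {name: max_shortcoming(p, envelope) for name, p in powers.items()}
most_stringent = min(shortcomings, key=shortcomings.get)
print(envelope, shortcomings, most_stringent)
```

In this toy table no single test attains the envelope at every point, yet one of them keeps its worst gap smallest, which is the sense of the definition.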

Under these circumstances we shall prove that the classical test of a linear 
hypothesis is most stringent. Our proof will occupy but a few lines, and is an easy 





consequence of the structure of the classical tests as described in the lemma of
section 2. The result itself is a special case of an unpublished theorem due to
G. H. Hunt and C. M. Stein, and all priority on this result is theirs.

Return then to the notation of section 2. Let $\sigma$ be fixed at any arbitrary
positive value, and the surface

$\sum_1^k \eta_i^2 = c_0^2$

be that one on which

$\omega_1(\eta) = \sup_{\{T\}} P\{T \mid \eta\} - P\{L_1 \mid \eta\}$

is a maximum, where $L_1$ is the classical test of the linear hypothesis. It is clear
that this maximum is actually achieved, and that $\omega_1(\eta)$ is a constant on the
surface $\sum_1^k \eta_i^2 = c_0^2$. Let $L_2$ be any other test (of size $\alpha$), and $\omega_2(\eta)$ be the
corresponding function for $L_2$. We have only to show that on the surface
$\sum_1^k \eta_i^2 = c_0^2$ we cannot have everywhere $\omega_2(\eta) < \omega_1(\eta)$, and our proof is
complete. If everywhere on the surface $\sum_1^k \eta_i^2 = c_0^2$ we had $\omega_2(\eta) < \omega_1(\eta)$, we
would have, also on the same surface, $P\{L_2 \mid \eta\} > P\{L_1 \mid \eta\}$. This would, however,
violate Wald's Theorem 2 (section 2) and proves the desired result.

REFERENCES

[1] P. L. Hsu, "Notes on Hotelling's generalized T," Annals of Math. Stat., Vol. 9 (1938),
p. 231.

[2] P. L. Hsu, "Analysis of variance from the power function standpoint," Biometrika,
Vol. 32 (1941), p. 62.

[3] J. B. Simaika, "On an optimum property of two important statistical tests,"
Biometrika, Vol. 32 (1941), p. 70.

[4] A. Wald, "On the power function of the analysis of variance test," Annals of Math.
Stat., Vol. 13 (1942), p. 434.

[5] J. A. Shohat and J. D. Tamarkin, The Problem of Moments, The American
Mathematical Society, New York, 1943.

[6] John Wishart, "The generalized product moment distribution, etc.," Biometrika,
Vol. 20A (1928), p. 32.

[7] H. Cramér and H. Wold, "Some theorems on distribution functions," Lond. Math.
Soc. Jour., Vol. 11 (1936).

[8] A. Wald, "Tests of statistical hypotheses concerning several parameters when the
number of observations is large," Am. Math. Soc. Trans., Vol. 54 (1943), p. 426.

[9] P. C. Tang, "The power function of the analysis of variance etc.," Stat. Res. Memoirs,
Vol. 2 (1938) (University of London), p. 126.

[10] M. G. Kendall, The Advanced Theory of Statistics, Vol. 1, Charles Griffin and
Company, London, 1945.

[11] S. Saks, Theory of the Integral (Second Edition), G. E. Stechert and Company, New
York, 1937.

[12] J. Neyman and E. S. Pearson, "On the problem of the most efficient tests of statistical
hypotheses," Roy. Soc. London Phil. Trans., Ser. A, Vol. 231 (1933), pp. 289-337.

[13] Eberhard Hopf, Ergodentheorie, Chelsea, New York.



APPLICATION OF THE METHOD OF MIXTURES TO QUADRATIC
FORMS IN NORMAL VARIATES

By Herbert Robbins and E. J. G. Pitman
Institute of Statistics, University of North Carolina

1. Summary. The method of mixtures, explained in Section 2, is applied to
derive the distribution functions of a positive quadratic form in normal variates
and of the ratio of two independent forms of this type.


2. The method of mixtures. If

(1) $F_0(x), F_1(x), \cdots$

is any sequence of distribution functions, and if

(2) $c_0, c_1, \cdots$

is any sequence of constants such that

(3) $c_j \ge 0$ (j = 0, 1, $\cdots$), $\sum c_j = 1$

(all summations will be from 0 to $\infty$ unless otherwise noted), then the function

(4) $F(x) = \sum c_j F_j(x)$

is called a mixture of the sequence (1).

It is sometimes helpful to interpret F(x) in the following manner. Let $J, X_0,
X_1, \cdots$ be variates such that J has the distribution $P[J = j] = c_j$ (j = 0, 1, $\cdots$)
and such that $X_j$ has the distribution function $F_j(x)$. Let X be a variate such
that the conditional distribution function of X given J = j is $F_j(x)$. Then the
distribution function of X is

$P[X \le x] = \sum P[J = j] \cdot P[X \le x \mid J = j] = \sum c_j F_j(x) = F(x)$.

This interpretation of F(x) will, however, not be involved in the present paper.
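
The two-stage interpretation can be simulated directly. The sketch below is an illustration under assumptions that are not in the paper: the mixture has only three components, each an exponential distribution with a chosen rate, and the weights are arbitrary. It draws the index J first, then draws X from the selected component, and compares the empirical distribution function with the weighted sum of the component distribution functions.

```python
import math
import random

weights = [0.5, 0.3, 0.2]   # c_j, hypothetical values
rates = [1.0, 2.0, 5.0]     # component F_j is exponential with this rate

def mixture_cdf(x):
    """F(x) = sum of c_j F_j(x) for the exponential components above."""
    if x <= 0:
        return 0.0
    return sum(c * (1.0 - math.exp(-lam * x)) for c, lam in zip(weights, rates))

def sample_mixture(rng):
    """Draw J with P[J = j] = c_j, then draw X from the J-th component."""
    j = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.expovariate(rates[j])

rng = random.Random(0)
draws = [sample_mixture(rng) for _ in range(100_000)]

def empirical_cdf(x):
    """Fraction of simulated values not exceeding x."""
    return sum(d <= x for d in draws) / len(draws)

for x in (0.2, 0.7, 1.5):
    print(x, empirical_cdf(x), mixture_cdf(x))
```

The empirical and exact columns should agree to about two decimal places with this sample size.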

The following statements are proved in [1]. If $x = (x_1, \cdots, x_n)$ is a vector
variable the function F(x) defined by (4) is a distribution function, and for
any Borel set S,

(5) $\int_S dF(x) = \sum c_j \int_S dF_j(x)$.

More generally, if g(x) is any Borel measurable function then

(6) $\int g(x)\, dF(x) = \sum c_j \int g(x)\, dF_j(x)$

whenever the left hand side of (6) exists. In particular, the characteristic function
$\varphi(t)$ corresponding to F(x) is

(7) $\varphi(t) = \sum c_j \varphi_j(t)$,

where $\varphi_j(t)$ is the characteristic function corresponding to $F_j(x)$.

If each $F_j(x)$ has a derivative $f_j(x)$ then F(x) has a derivative f(x) given by

(8) $f(x) = \sum c_j f_j(x)$,

provided that this series converges uniformly in some interval including x.
Conversely, if (8) is the relation between the frequency functions and if the
series is uniformly convergent in every finite interval, then the relation between
the distribution functions is given by (4). In practice we deduce (4) from (8), or,
using the uniqueness theorem for characteristic functions, from (7).

As regards computation, we observe that for any integers $0 \le p_1 \le p_2$ and
for any x it follows from (3) and (4) that

(9) $0 \le F(x) - \sum_{p_1}^{p_2} c_j F_j(x) = \sum_{0}^{p_1-1} c_j F_j(x) + \sum_{p_2+1}^{\infty} c_j F_j(x)$

$\le \left(\sum_{0}^{p_1-1} c_j\right) + \sup_{j > p_2} F_j(x) \cdot \left(1 - \sum_{0}^{p_1-1} c_j - \sum_{p_1}^{p_2} c_j\right) \le 1 - \sum_{p_1}^{p_2} c_j$.

The existence of these upper bounds (the last a uniform one) for the error term
when the series (4) is replaced by a finite sum shows that series expansions of the
mixture type (4) are especially well adapted to computational work.

For some purposes it is useful to consider series expansions of the type (4)
where the $c_j$ may be of both signs and where the series $\sum c_j$ may diverge. Both
parts of (3) will, however, be satisfied in the cases considered here.

If U, V are independent variates with respective distribution functions
F(x), G(x) we shall denote the distribution function of any Borel measurable
function H(U, V) by

$H(U, V)(F(x), G(x))$.

Now if F(x), G(x) are both mixtures,

$F(x) = \sum c_j F_j(x)$, $G(x) = \sum d_k G_k(x)$,

then by (5),

$P[H(U, V) \le x] = \iint_{\{H(u,v) \le x\}} dF(u)\, dG(v) = \sum\sum c_j d_k \iint_{\{H(u,v) \le x\}} dF_j(u)\, dG_k(v)$,

so that

(10) $H(U, V)\left(\sum c_j F_j(x), \sum d_k G_k(x)\right) = \sum\sum c_j d_k\, H(U, V)(F_j(x), G_k(x))$.




As an application of the principles set forth in this section we shall express
as series of the mixture type (4) the distribution functions of any positive
quadratic form in normal variates and of the ratio of any two independent forms
of this type. Special cases of the problem have been dealt with by Tang [2],
Hsu [3], and many others, but the method of mixtures permits a unified and
simple treatment of the general case.


3. Distribution of a positive quadratic form. We shall denote by $F_n(x)$ the
chi-square distribution function with n > 0 degrees of freedom,

(11) $F_n(x) = \frac{1}{2^{n/2}\,\Gamma(n/2)} \int_0^x u^{n/2 - 1} e^{-u/2}\, du$ (x > 0),
$= 0$ ($x \le 0$).

The corresponding characteristic function is

(12) $\varphi_n(t) = \int_0^\infty e^{itx}\, dF_n(x) = (1 - 2it)^{-n/2} = w^{n/2}$,

where we have set $w = (1 - 2it)^{-1}$. We shall denote by $\chi_n^2$ any variate with
the distribution function (11).

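Let a be any constant such that a > 0. The characteristic function of the
variate $a\chi_n^2$ is

(13) $(1 - 2iat)^{-n/2} = [a(1 - 2it) - (a - 1)]^{-n/2} = a^{-n/2}\, w^{n/2} \left[1 - \left(1 - \frac{1}{a}\right) w\right]^{-n/2}$.

By the binomial theorem we have for any a > 0,

(14) $a^{-n/2} \left[1 - \left(1 - \frac{1}{a}\right) z\right]^{-n/2} = \sum c_j z^j$ $\left(|z| < \left|1 - \frac{1}{a}\right|^{-1}\right)$,

where

(15) $c_j = a^{-n/2}\, \frac{\frac{n}{2}\left(\frac{n}{2} + 1\right) \cdots \left(\frac{n}{2} + j - 1\right)}{j!} \left(1 - \frac{1}{a}\right)^j$ (j = 0, 1, $\cdots$).

For $a \ge 1$ we see from (15) that all the $c_j$ are non-negative. Likewise for $a > \frac{1}{2}$
(and hence a fortiori for $a \ge 1$) we have $|1 - 1/a|^{-1} > 1$ so that (14) holds
for all $|z| \le 1$; setting z = 1 it follows that the sum of all the $c_j$ is equal to 1.
Hence for $a \ge 1$,

$c_j \ge 0$ (j = 0, 1, $\cdots$), $\sum c_j = 1$.

Since $|w| = |1 - 2it|^{-1} \le 1$ for all real t it follows from (13) and (14) that
for $a \ge 1$,

(16) $(1 - 2iat)^{-n/2} = \sum c_j w^{n/2 + j} = \sum c_j (1 - 2it)^{-n/2 - j} = \sum c_j \varphi_{n+2j}(t)$.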





Hence for $a \ge 1$ the distribution function $F_n(x/a)$ of the variate $a\chi_n^2$ is a mixture
of $\chi^2$ distribution functions,

(17) $F_n(x/a) = \sum c_j F_{n+2j}(x)$,

where the $c_j$, determined by the identity (14), are the probabilities of a negative
binomial distribution.

It may, in fact, be proved by a direct analysis, which we omit here, that (17)
holds for any a > 0. However, if a < 1 then the $c_j$ will be of alternating sign,
and if $a \le \frac{1}{2}$ then the series $\sum c_j$ will diverge. This shows incidentally that a
relation of the form (4) can hold even though the series $\sum c_j$ diverges and hence
the corresponding relation (7) does not hold for t = 0.
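
Relation (17) is easy to check numerically when n is even, because the chi-square distribution function then has an elementary closed form. The sketch below makes that restriction explicitly; the values n = 4, a = 2.5 and the number of terms are arbitrary choices. It generates the coefficients of (15) by the recursion $c_{j+1} = c_j (n/2 + j)(1 - 1/a)/(j + 1)$ and compares the two sides of (17) at a few points.

```python
import math

def chi2_cdf_even(n, x):
    """Chi-square cdf for even n: 1 - exp(-x/2) * sum_{i < n/2} (x/2)^i / i!."""
    if x <= 0:
        return 0.0
    term, total = 1.0, 1.0
    for i in range(1, n // 2):
        term *= (x / 2.0) / i
        total += term
    return 1.0 - math.exp(-x / 2.0) * total

def coefficients(n, a, count):
    """c_0, ..., c_{count-1} of (15), built by the ratio recursion."""
    c = [a ** (-n / 2.0)]
    for j in range(count - 1):
        c.append(c[-1] * (n / 2.0 + j) * (1.0 - 1.0 / a) / (j + 1))
    return c

n, a = 4, 2.5                      # illustrative values, n even, a >= 1
c = coefficients(n, a, count=200)

def mixture_side(x):
    """Right-hand side of (17), truncated after 200 terms."""
    return sum(cj * chi2_cdf_even(n + 2 * j, x) for j, cj in enumerate(c))

for x in (1.0, 7.0, 20.0):
    print(x, chi2_cdf_even(n, x / a), mixture_side(x))
```

For these values the coefficients are all positive and sum to 1 up to rounding, in line with the negative binomial interpretation.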

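Theorem 1. Let

$X = a(\chi_m^2 + a_1 \chi_{m_1}^2 + \cdots + a_r \chi_{m_r}^2)$,

where the chi-square variates are independent and $a, a_1, \cdots, a_r$ are positive constants
such that

$a_i \ge 1$ (i = 1, $\cdots$, r).

Define constants $c_j$ by the identity¹

(18) $\prod_{i=1}^{r} \left\{a_i^{-m_i/2} \left[1 - \left(1 - \frac{1}{a_i}\right) z\right]^{-m_i/2}\right\} = \sum c_j z^j$ ($|z| \le 1$);

then obviously

$c_j \ge 0$ (j = 0, 1, $\cdots$), $\sum c_j = 1$.

Let

$M = m + m_1 + \cdots + m_r$;

then for every x,

(19) $P[X \le x] = \sum c_j F_{M+2j}(x/a)$.

For any integers $0 \le p_1 \le p_2$ and every x,

(20) $0 \le P[X \le x] - \sum_{p_1}^{p_2} c_j F_{M+2j}(x/a)$

$\le \left(\sum_{0}^{p_1-1} c_j\right) + F_{M+2p_2+2}(x/a) \left(1 - \sum_{0}^{p_1-1} c_j - \sum_{p_1}^{p_2} c_j\right) \le 1 - \sum_{p_1}^{p_2} c_j$.

Proof. The characteristic function of X/a is, by (13) and (18),

$\varphi(t) = w^{m/2} \prod_{i=1}^{r} \left\{a_i^{-m_i/2}\, w^{m_i/2} \left[1 - \left(1 - \frac{1}{a_i}\right) w\right]^{-m_i/2}\right\} = w^{M/2} \sum c_j w^j = \sum c_j \varphi_{M+2j}(t)$.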

¹ If r = 0 we regard the left hand side of (18) as having the value 1.





Hence for any y, 

P[X/a < j/J « £c, FM-iriiiv), 

whence (19) follows on setting a: « ay. Finally, since F(x) is a decreasing function 
of n for fixed x, (20) follows from (9). 

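It should be observed that the coefficients $c_j$ determined by (18) can be written explicitly as the multiple Cauchy products

$$ c_j = \sum_{j_1 + \cdots + j_r = j} c_{1,j_1}\, c_{2,j_2} \cdots c_{r,j_r}, $$

where

$$ c_{i,j} = a_i^{-m_i/2}\,\frac{\tfrac12 m_i(\tfrac12 m_i + 1) \cdots (\tfrac12 m_i + j - 1)}{j!}\,(1 - a_i^{-1})^j \qquad (i = 1, \ldots, r;\ j = 0, 1, \ldots). $$

The $c_j$ may be computed stepwise by the relations

$$ c_j^{(1)} = c_{1,j}, $$

$$ c_j^{(s)} = \sum_{i=0}^{j} c_i^{(s-1)}\, c_{s,j-i} \qquad (s = 2, \ldots, r), $$

$$ c_j^{(r)} = c_j. $$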
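The stepwise relations above amount to repeated discrete convolution. As a minimal sketch, assuming the series is truncated to a fixed number of coefficients (a choice made here, not in the paper) and using hypothetical function names, the computation might be organised as follows; the result is compared with the generating function on the left of (18).

```python
def negbin_weights(a_i, m_i, terms):
    """c_{i,j} = a_i^(-m_i/2) (m_i/2)(m_i/2+1)...(m_i/2+j-1)/j! (1 - 1/a_i)^j."""
    c = [a_i ** (-m_i / 2.0)]
    for j in range(terms - 1):
        c.append(c[-1] * (m_i / 2.0 + j) / (j + 1) * (1.0 - 1.0 / a_i))
    return c

def mixture_weights(a_list, m_list, terms):
    """Stepwise Cauchy products c^(1), ..., c^(r), truncated to `terms` coefficients."""
    c = [1.0] + [0.0] * (terms - 1)          # r = 0: the left side of (18) equals 1
    for a_i, m_i in zip(a_list, m_list):
        w = negbin_weights(a_i, m_i, terms)
        c = [sum(c[i] * w[j - i] for i in range(j + 1)) for j in range(terms)]
    return c

def generating_function(a_list, m_list, z):
    """Left-hand side of (18) evaluated at z."""
    out = 1.0
    for a_i, m_i in zip(a_list, m_list):
        out *= a_i ** (-m_i / 2.0) * (1.0 - (1.0 - 1.0 / a_i) * z) ** (-m_i / 2.0)
    return out

a_list, m_list = [2.0, 3.0], [1, 2]          # illustrative values only
c = mixture_weights(a_list, m_list, 300)
series_at_half = sum(cj * 0.5 ** j for j, cj in enumerate(c))
print(sum(c), series_at_half, generating_function(a_list, m_list, 0.5))
```

The convolution costs on the order of the square of the truncation length per factor, which is negligible for the few hundred coefficients typically needed when every $a_i$ exceeds 1.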


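4. Distribution of a ratio. The ratio $\chi^2_m/\chi^2_n$ of two independent chi-square variates has the distribution function

$$ F_{m,n}(x) = \frac{\Gamma(\tfrac12 m + \tfrac12 n)}{\Gamma(\tfrac12 m)\,\Gamma(\tfrac12 n)} \int_0^x u^{\frac12 m - 1}(1 + u)^{-\frac12 (m+n)}\,du \quad (x \ge 0), \qquad = 0 \quad (x < 0). \tag{21} $$

In computational work we can use the tables of the Beta distribution function

$$ I_x(r, s) = \frac{\Gamma(r + s)}{\Gamma(r)\,\Gamma(s)} \int_0^x u^{r-1}(1 - u)^{s-1}\,du \quad (0 \le x \le 1), \qquad = 0 \ (x < 0), \quad = 1 \ (x > 1), $$

together with the identity

$$ F_{m,n}(x) = I_{x/(1+x)}(\tfrac12 m, \tfrac12 n). $$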


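Theorem 2. Let

$$ X = \frac{a\,(\chi^2_m + a_1\chi^2_{m_1} + \cdots + a_r\chi^2_{m_r})}{\chi^2_n + b_1\chi^2_{n_1} + \cdots + b_s\chi^2_{n_s}}, \tag{22} $$

where the $\chi^2$ variates are independent and $a, a_1, \ldots, a_r, b_1, \ldots, b_s$ are positive constants such that

$$ a_i > 1, \quad b_j > 1 \qquad (i = 1, \ldots, r;\ j = 1, \ldots, s). $$

Define constants $c_j$, $d_k$ by the identities

$$ \prod_{i=1}^{r} \left\{ a_i^{-m_i/2}\,[1 - (1 - a_i^{-1})z]^{-m_i/2} \right\} = \sum_j c_j z^j, \qquad \prod_{j=1}^{s} \left\{ b_j^{-n_j/2}\,[1 - (1 - b_j^{-1})z]^{-n_j/2} \right\} = \sum_k d_k z^k \qquad (|z| \le 1); $$

then

$$ c_j \ge 0, \quad \sum_j c_j = 1, \qquad d_k \ge 0, \quad \sum_k d_k = 1. $$

Let

$$ M = m + m_1 + \cdots + m_r, \qquad N = n + n_1 + \cdots + n_s; $$

then for every $x$,

$$ P[X \le x] = \sum_j \sum_k c_j d_k F_{M+2j,\,N+2k}(x/a), $$

and for any integers $0 \le p_1 \le p_2$, $0 \le q_1 \le q_2$ and every $x$,

$$ 0 \le P[X \le x] - \sum_{j=p_1}^{p_2} \sum_{k=q_1}^{q_2} c_j d_k F_{M+2j,\,N+2k}(x/a) \le 1 - \Big(\sum_{j=p_1}^{p_2} c_j\Big)\Big(\sum_{k=q_1}^{q_2} d_k\Big). $$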

Proof. Let $U$, $V$ denote respectively numerator and denominator of (22). From Theorem 1,

$$ P[U \le x] = \sum_j c_j F_{M+2j}(x/a), $$

$$ P[V \le x] = \sum_k d_k F_{N+2k}(x). $$

Hence by (10), for every $x$,

$$ P[X \le x] = P[U/V \le x] = \sum_j \sum_k c_j d_k F_{M+2j,\,N+2k}(x/a). $$

The rest of the theorem is obvious.

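Corollary. Let

$$ X = \frac{\chi^2_M}{a\chi^2_r + b\chi^2_s}, $$

where the $\chi^2$ variates are independent and

$$ 0 < a < b. $$

Define

$$ \alpha = a/b, \qquad N = r + s, $$

$$ c_j = \alpha^{s/2}\,\frac{\tfrac12 s(\tfrac12 s + 1) \cdots (\tfrac12 s + j - 1)}{j!}\,(1 - \alpha)^j \qquad (j = 0, 1, \ldots); $$

then

$$ c_j \ge 0 \quad (j = 0, 1, \ldots), \qquad \sum_j c_j = 1, $$

and for every $x$,

$$ P[X \le x] = \sum_j c_j F_{M,\,N+2j}(ax). $$

For any integers $0 \le p_1 \le p_2$ and every $x$,

$$ 0 \le P[X > x] - \sum_{j=p_1}^{p_2} c_j\,[1 - F_{M,\,N+2j}(ax)] \le [1 - F_{M,N}(ax)]\Big(\sum_{j=0}^{p_1-1} c_j\Big) + [1 - F_{M,\,N+2p_2+2}(ax)]\Big(1 - \sum_{j=0}^{p_2} c_j\Big). \tag{23} $$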

Proof. Except for (23) this is a special case of Theorem 2. To prove (23) we observe that

$$ P[X > x] = 1 - P[X \le x] = \sum_j c_j\,[1 - F_{M,\,N+2j}(ax)], $$

and since for fixed $m$ and $x$, $F_{m,n}(x)$ is an increasing function of $n$, (23) follows in the same way as (9).
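As a hedged numerical illustration of the Corollary (not taken from the paper), consider the special case $M = 2$. A chi-square variate with 2 degrees of freedom is exponential with mean 2, so one may assume the standard closed forms $1 - F_{2,n}(y) = (1 + y)^{-n/2}$ and $P[X > x] = (1 + ax)^{-r/2}(1 + bx)^{-s/2}$, and both sides of the series can be compared without tables. The function names and parameter values below are illustrative assumptions.

```python
def corollary_weights(a, b, s, terms):
    """c_j = alpha^(s/2) (s/2)(s/2+1)...(s/2+j-1)/j! (1 - alpha)^j, alpha = a/b."""
    alpha = a / b
    c = [alpha ** (s / 2.0)]
    for j in range(terms - 1):
        c.append(c[-1] * (s / 2.0 + j) / (j + 1) * (1.0 - alpha))
    return c

def upper_tail_exact(x, a, b, r, s):
    """P[X > x] for X = chi2_2 / (a chi2_r + b chi2_s); closed form valid only for M = 2."""
    return (1.0 + a * x) ** (-r / 2.0) * (1.0 + b * x) ** (-s / 2.0)

def upper_tail_series(x, a, b, r, s, terms=500):
    """sum_j c_j [1 - F_{2,N+2j}(a x)], using 1 - F_{2,n}(y) = (1 + y)^(-n/2)."""
    N = r + s
    return sum(c * (1.0 + a * x) ** (-(N + 2 * j) / 2.0)
               for j, c in enumerate(corollary_weights(a, b, s, terms)))

a, b, r, s, x = 1.0, 3.0, 3, 4, 0.7
print(upper_tail_exact(x, a, b, r, s), upper_tail_series(x, a, b, r, s))
```

For general $M$ the terms $F_{M,N+2j}$ would instead be taken from the Beta distribution function via the identity of Section 4.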

5. The non-central case. Let $Y$ be normal (0, 1) and let $X = (Y + d)^2$, where $d$ is any constant. The frequency function of $X$ is, for $x > 0$,

$$ f(x) = (2\pi x)^{-1/2}\, e^{-\frac12 (x + d^2)} \cdot \tfrac12\left(e^{d\sqrt{x}} + e^{-d\sqrt{x}}\right). $$

By expanding the last factor into a power series it is easily seen that

$$ f(x) = \sum_j p_j f_{1+2j}(x), \tag{24} $$

where $f_n(x) = F_n'(x)$ is the chi-square frequency function with $n$ degrees of freedom and where

$$ p_j = e^{-\frac12 d^2}\,(\tfrac12 d^2)^j / j! \qquad (j = 0, 1, \ldots). $$

Since the identity

$$ e^{-\frac12 d^2 (1 - z)} = \sum_j p_j z^j \qquad (\text{all } z) \tag{25} $$

holds, it follows that

$$ p_j \ge 0 \quad (j = 0, 1, \ldots), \qquad \sum_j p_j = 1. $$





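The series (24) is uniformly convergent in every finite interval, so that we can write the distribution function $F(x)$ and characteristic function $\varphi(t)$ of $X$ in the forms

$$ F(x) = \sum_j p_j F_{1+2j}(x), $$

$$ \varphi(t) = \sum_j p_j w^{\frac12 + j} = w^{1/2}\, e^{-\frac12 d^2 (1 - w)}, $$

where again we have set $w = (1 - 2it)^{-1}$.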

Now let $Y_1, \ldots, Y_n$ be independent and normal (0, 1) variates and let

$$ X = (Y_1 + d_1)^2 + \cdots + (Y_n + d_n)^2, \tag{26} $$

where the $d_i$ are constants such that

$$ d_1^2 + \cdots + d_n^2 = d^2. $$

The characteristic function of $X$ is then

$$ \varphi(t) = w^{n/2}\, e^{-\frac12 d^2 (1 - w)} = \sum_j p_j w^{\frac12 n + j}, $$

and hence the distribution function $F(x)$ of $X$ is again a mixture of $\chi^2$ distribution functions,

$$ F(x) = \sum_j p_j F_{n+2j}(x), \tag{27} $$

where the $p_j$, determined by the identity (25), are the probabilities of a Poisson distribution with parameter $\lambda = \tfrac12 d^2$. We shall denote the non-central chi-square variate (26) by $\chi^2_{n,d}$.
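The Poisson mixture (27) translates directly into a short computation. The sketch below is illustrative rather than the authors' procedure: it truncates the series after a fixed number of terms, evaluates $F_n$ by a power series, and checks the case $n = 1$ against the elementary expression $\Phi(\sqrt{x} - d) - \Phi(-\sqrt{x} - d)$ for $P[(Y + d)^2 \le x]$. Function names and the test values are assumptions.

```python
import math

def chi2_cdf(x, n):
    """F_n(x) from the power series of the regularized lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    s, h = n / 2.0, x / 2.0
    term = math.exp(s * math.log(h) - h - math.lgamma(s + 1.0))
    total = 0.0
    for k in range(1, 2000):
        total += term
        term *= h / (s + k)
        if term < 1e-17 * total:
            break
    return total

def noncentral_chi2_cdf(x, n, d, terms=200):
    """Truncated series (27): sum_j p_j F_{n+2j}(x), Poisson weights with lambda = d^2/2."""
    lam = 0.5 * d * d
    p = math.exp(-lam)            # p_0
    total = 0.0
    for j in range(terms):
        total += p * chi2_cdf(x, n + 2 * j)
        p *= lam / (j + 1)        # p_{j+1} from p_j
    return total

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

d, x = 1.2, 3.0
direct = normal_cdf(math.sqrt(x) - d) - normal_cdf(-math.sqrt(x) - d)
print(noncentral_chi2_cdf(x, 1, d), direct)
```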

We can now generalize Theorems 1 and 2 in a straightforward manner to 
cover non-central chi-square variates. We shall state only the generalization 
of the Corollary of Theorem 2 to the case in which the numerator is non-central. 


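Theorem 3. Let

$$ X = \frac{\chi^2_{M,d}}{a\chi^2_r + b\chi^2_s}, $$

where the $\chi^2$ variates are independent and

$$ 0 < a < b. $$

Define

$$ \lambda = \tfrac12 d^2, \qquad \alpha = a/b, \qquad N = r + s, $$

$$ p_j = e^{-\lambda}\lambda^j / j! \qquad (j = 0, 1, \ldots), $$

$$ c_k = \alpha^{s/2}\,\frac{\tfrac12 s(\tfrac12 s + 1) \cdots (\tfrac12 s + k - 1)}{k!}\,(1 - \alpha)^k \qquad (k = 0, 1, \ldots); $$

then

$$ p_j \ge 0, \quad \sum_j p_j = 1, \qquad c_k \ge 0, \quad \sum_k c_k = 1, $$

and for every $x$,

$$ P[X \le x] = \sum_j \sum_k p_j c_k F_{M+2j,\,N+2k}(ax). $$

For any integers $0 \le g_1 \le g_2$, $0 \le h_1 \le h_2$,

$$ 0 \le P[X \le x] - \sum_{j=g_1}^{g_2} \sum_{k=h_1}^{h_2} p_j c_k F_{M+2j,\,N+2k}(ax) \le 1 - \Big(\sum_{j=g_1}^{g_2} p_j\Big)\Big(\sum_{k=h_1}^{h_2} c_k\Big). $$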

REFERENCES

[1] Herbert Robbins, "Mixture of distributions," Annals of Math. Statistics, Vol. 19 (1948), p. 360.

[2] P. C. Tang, "The power function of the analysis of variance tests with tables and illustrations of their use," Stat. Res. Mem., Vol. 2 (1938), p. 126.

[3] P. L. Hsu, "Contributions to the theory of 'Student's' t-test as applied to the problem of two samples," ibid., p. 1.



THE JOINT DISTRIBUTION OF SERIAL CORRELATION
COEFFICIENTS

By M. H. Quenouille
Rothamsted Experimental Station

1. Summary. An expression for the joint distribution of serial correlation coefficients, circularly defined, has been derived. It has been shown that this distribution possesses properties similar to those already encountered in the distribution of a single serial correlation coefficient, i.e. it is defined by different functional forms for various subregions. The distribution thus found is of little use for computational purposes. Consequently, approximate forms have been investigated and the suitability of the ordinary partial correlation coefficient for large-sample testing has been inferred.

2. Introduction. Anderson [1] has derived the distribution of the serial correlation coefficient

$$ r_j = \frac{\sum_{i=1}^{n} (\epsilon_i - \bar\epsilon)(\epsilon_{i+j} - \bar\epsilon)}{\sum_{i=1}^{n} (\epsilon_i - \bar\epsilon)^2}, $$

where the $\epsilon_i$ are normally and independently distributed with mean $\mu$ and variance $\sigma^2$ and where a circular definition is employed, so that $\epsilon_{n+i}$ is defined to be equal to $\epsilon_i$. However, in making a test of any series, we shall usually be faced with a set of serial correlation coefficients, so that we shall require a joint distribution function of $r_1, r_2, \ldots, r_m$ say. This distribution function is derived below by an extension of the method used by Koopmans [2].

It should be noted that Bartlett [3] has shown that for large samples the variances and covariances of the $r_i$ are independent of the distribution of $\epsilon_i$ under fairly wide conditions. This means that the joint distribution function obtained for normal $\epsilon_i$ will often give a good approximation for non-normal $\epsilon_i$ and can be used as the basis for any test of the correlogram.

3. Conditions on the $r_i$. It is easily seen that the $r_i$ cannot take all values from $+1$ to $-1$ independently. For example, $r_2$ cannot take a value near $-1$ if $r_1$ takes a value near $+1$. As a result, there will be certain necessary conditions that the $r_i$ will have to fulfil. It is not difficult to find these conditions, since, if $y_i$ $(i = 1, 2, \ldots, n)$ are any set of variables, then

$$ \sum_i (\epsilon_{i+j}\, y_j)^2 = \Big(\sum_i \epsilon_i^2\Big)\, r_{|j-k|}\, y_j y_k \qquad (r_0 = 1), \tag{1} $$

where $\epsilon_i$ may or may not be corrected for the mean and the double-suffix summation convention is employed.




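Thus, provided $0 < m < n/2$, we will have

$$ R_m \equiv \begin{vmatrix} 1 & r_1 & \cdots & r_m \\ r_1 & 1 & \cdots & r_{m-1} \\ \vdots & \vdots & & \vdots \\ r_m & r_{m-1} & \cdots & 1 \end{vmatrix} \ge 0 \tag{2} $$

as a necessary condition that the right-hand side of (1) be positive definite, and this expression will impose necessary conditions upon the joint distribution of the $r_j$.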

Fig. 1 gives the limits of possible values of Ti and rj subject to (a) no restriction, 
(b) r» = 0, (c) n = u = 0. 
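The kind of restriction described in this section can be observed empirically. The sketch below is an illustration and not part of the paper: it computes circularly defined serial correlations, corrected for the mean, for simulated normal series and checks the bound $r_2 \ge 2r_1^2 - 1$, which Section 7 gives as the limit on $r_2$ when $m = 2$. The helper name, series length and number of trials are assumptions.

```python
import random

def circular_serial_correlations(e, max_lag):
    """r_1..r_max_lag with the circular convention e[n+i] = e[i], corrected for the mean."""
    n = len(e)
    mean = sum(e) / n
    dev = [v - mean for v in e]
    p = sum(v * v for v in dev)
    return [sum(dev[i] * dev[(i + j) % n] for i in range(n)) / p
            for j in range(1, max_lag + 1)]

rng = random.Random(0)                      # fixed seed so the run is reproducible
slack = []
for _ in range(500):
    series = [rng.gauss(0.0, 1.0) for _ in range(9)]
    r1, r2 = circular_serial_correlations(series, 2)
    slack.append(r2 - (2.0 * r1 * r1 - 1.0))   # non-negative if the bound holds
print(min(slack))
```

No simulated pair falls below the curve $r_2 = 2r_1^2 - 1$; in particular $r_1$ near $+1$ forces $r_2$ near $+1$ as well.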


4. Complex Integration in m Variables. Before finding the joint distribution 
function of the r, some introductory remarks on complex integration involving m 
variables will be necessary. 









We can evaluate an integral such as 



where 3{a^ = 0 and /(% , , Zm) is regular in the region ^(z,) > 0, by 

successive Cauchy integrations, so the integral has a value (27ni)’7(oi, ■ • ■ , a^). 
In the same manner as for Cauchy integration, it will be possible to distort the 
contours over which we integrate so that we can evaluate 

. /(gi ■ • • gm) 

i;“J fife -«.) “■ 

,=i 

provided that /(zi, • ■ • , Zm) is regular in the region defined by S, and (ai, ■ • ,am) 
is enclosed in this region 

More generally, if we have an integral of the form 



and we make the transformations w, - o„z, and b, = a„c,, i.e. W = A^, 

C = A~^B, it is possible, in the above manner, to evaluate the integral as 

(3) 

Suppose we now consider the integral 

f... f 

® II 

,-i 

where n '> m. We may select a set, Qk, of m equations = b ,, and let 
Ak = [a.,], Bk = [5J, Ck = Ai;"Bk = [cdt]. Then, we may carry out the integration 
as previously, in this case, summing a series of terms for various combinations 
of w equations out of the possible n. The value of the integral may then be 

written 

where the summation occurs over the points $(c_{1k}, c_{2k}, \ldots, c_{mk})$ lying in the region defined by $S$, and the product term excludes the set of equations $Q_k$. The ambiguity of sign in (3) and (4) arises from the Jacobian $|A_k|$, and the sign must be chosen which makes the transformation of $dz_1, \ldots, dz_m$ yield a positive element. It must be noted that it is possible to obtain several expansions of the form (4) according to the convention that is employed in defining "enclosure" for each of the variables.

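5. Integral form for the joint distribution function. We can, without loss of generality, assume $\sigma^2 = 1$. Suppose that

$$ p = \sum_i (\epsilon_i - \bar\epsilon)^2, \qquad \bar\epsilon = \sum_i \epsilon_i / n, \qquad q_j = \sum_i (\epsilon_i - \bar\epsilon)(\epsilon_{i+j} - \bar\epsilon), $$

where $\epsilon_1, \epsilon_2, \ldots, \epsilon_n$ are independent, so that $r_j = q_j/p$. Then by a consideration of $n$-dimensional space, we can see that $p$ is distributed independently of $r_1, \ldots, r_m$, so that their joint distribution can be written $g(p)\,h(r_1, \ldots, r_m)\,dp\,dr_1 \cdots dr_m$. The joint distribution of $p$ and $q_1, \ldots, q_m$ can thus be written

$$ f(p, q_1, \ldots, q_m)\,dp\,dq_1 \cdots dq_m = \frac{g(p)}{p^m}\, h\!\left(\frac{q_1}{p}, \ldots, \frac{q_m}{p}\right) dp\,dq_1 \cdots dq_m, \tag{5} $$

where it is not difficult to see that

$$ g(p) = \frac{p^{\frac12 (n-3)}\, e^{-\frac12 p}}{2^{\frac12 (n-1)}\,\Gamma(\tfrac12 (n-1))}. \tag{6} $$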

We can now find the joint distribution of p and qi, • • ■ , gm by inverting the 
characteristic function of these variables. This is given by 

{ 2 ^ [~T ’ 


where 


1 / 1 A |‘, 


«' = [ej , Q , • • • , *„] 


1^ I ~ (1 ~ 2t7j — 2i0jKn), 

« (1 - ii (1 - KjKu), 

1-1 


2iry I 

*/( =■ cos —, 
n 

2iBt 

“ 1 - 2x7, ’ 


so that the joint distribution of p and gi, • ■ • , is 

1 r 1 

Sip, Qi’ ■ -Qm) => L ‘ ■ j mi df) d$i‘’- de„ 




(27r)-»+‘ 


exp < - 


Ja, [Tfl 

(l — 2x7j)icygy\ dKi - dK,„ 

2 i f2x)« 





where S„ is the region bounded by k, = ± 


2 tt» 


. Now S, can be replaced 


1 - 

by region S enclosing the same set of singularities on the real hyperplane, and S 
can be chosen independent of i). Thus it will be possible to reverse the order of 

r“ 

integration in (7) provided that / 11 — 2ir) converges, i.e. provided 

"—00 

n > 2m + 3. Then since 


i £ (1 - 2^7;) exp (-t7)(p - KjQj)] dn 


■j4(n—2m —l)-p /"yi 2??1. — l\ 

' H-2- ] 


exp l-§(p - K,?,)} for p > K,qi, 


= 0 for p < Kjq,, 

we get 

fiP, Si ■ ■ ■ 9m) = 




( 8 ) 






iv - K,9))^^"~^"~- 


rn-i 

l^n (1 - 


1 dxi * ' ' dKm , 


where S encloses the same singularities as S,, all of which lie in the region 
P > If "*ve now use (5) and (6) we get 



In a similar manner, it is possible to derive for n > 2m + 3 the joint distribution 
of serial correlation coefficients, f jl , • • • , fm, uncorrected for the mean, in 
the form 


( 10 ) 


h(n ■ • fm) = 



(1 - 

- n -1} 

n (1 — KiKji) 

_!-l J 


(Zki ■ • • dKjfl 


6. Extension for variables in an autoregressive scheme. Madow [4] has shown how to extend the distribution of the serial correlation coefficient for uncorrelated variables to the case when the variables $x_i$ are connected by a linear Markoff scheme, $x_i = \rho x_{i-1} + \epsilon_i$, with a normal distribution of the error $\epsilon$. It is worth noting that the method used by Madow can be applied to derive the joint distribution of serial correlations of variables $x_i$ which are connected by a linear autoregressive scheme of order $m$, or less,

$$ \alpha_0 x_i + \alpha_1 x_{i-1} + \cdots + \alpha_m x_{i-m} = \epsilon_i, $$

where $\epsilon_1, \ldots, \epsilon_n$ are normally and independently distributed, and $\epsilon_{n+i} = \epsilon_i$.¹ Under these conditions, the expression (9) will be modified by a factor

»(n-H 

1 1 

I (4 + ’ 

where 

A = tal 

M 

tn~~i 

-B,' = 13 mflit-t-y, 

*-o 

while (10) will be modified by a similar factor with $n$ replacing $n - 1$.



7. Reduction of the distribution function integral. Using the method described in section 4, it is now possible to reduce the integral given in (9), if we observe that $\kappa_{jk} = \kappa_{j,n-k}$ and assume $n$ odd. We then have


hirv ■ - rj 



(1 — 

li (1 ~ 

/-I 


dxi • • • dKm 


( 12 ) 


_V 


1 I 

r Kk 

»(-. 


^{n — 2m — 

IW 



1 

I 


V 2 


n 

ll^k 

*'yt 

Kk 


where I = (1, 1, • ■ , 1), r' = (n , r*, • • • , r„), Ky, = (kji , • ■ • , ««,) and 
Kk is the matrix formed from a sot gi, of the m matrices Jyi arranged in order. 
The factors in the summation can most easily bo determined if we put 
' 1 / 

^ « A(ri, ■ • • , rm-i) — r„ and sum over tlie region for which < 


$A(r_1, \ldots, r_{m-1})$. To demonstrate the manner in which formula (12) works, we shall consider $m = 2$. From formula (2) we can see that a limit to the possible values that $r_2$ can take is given by $r_2 = 2r_1^2 - 1$, i.e. by the curve $(\cos\theta, \cos 2\theta)$ in the $(r_1, r_2)$ plane. It is not difficult to see that there are $\tfrac18 (n-1)(n-3)$ possible terms in (12) and that each of these terms is proportional to the $\tfrac12 (n - 2m - 3)$th power of the distance from a line in the $(r_1, r_2)$ plane. These lines are the joins of the points $(\cos 2\pi i/n, \cos 4\pi i/n)$, $i = 1, \ldots, \tfrac12 (n - 1)$, and the joins of such points on the curve $(\cos\theta, \cos 2\theta)$ give the outer limits of the possible values of $r_1$ and $r_2$.

It can also be seen that these points correspond to the equations $\kappa_j \kappa_{ji} = 1$ (each of these equations determines a plane in 4-dimensional complex space), while the joins of these points correspond to the singularities defined by, and terms arising from, pairs of these equations. Furthermore, since the sum of residues in any plane is zero, the sum of contributions, taken with appropriate signs, arising from lines through any of these points is zero, i.e. the sum of all possible terms involving any particular $\kappa_{ji}$ will disappear. This leads to several possible expansions for $h(r_1, \ldots, r_m)$.

¹ This is a sufficient condition for $x_{n+i} = x_i$.

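If we consider the particular case $n = 9$, then each term in the expansion (12) is proportional to the distance from one of the lines joining $(\cos 2\pi i/9, \cos 4\pi i/9)$, $i = 1, 2, 3, 4$. These lines may be denoted by $l_{ij}$. Then the contribution from $l_{ij}$ is given by

$$ -3\,\frac{\kappa_{1i}\kappa_{1j} - (\kappa_{1i} + \kappa_{1j})\,r_1 + \tfrac12 (r_2 + 1)}{(\kappa_{1i} - \kappa_{1j})(\kappa_{1i} - \kappa_{1k})(\kappa_{1i} - \kappa_{1l})(\kappa_{1j} - \kappa_{1k})(\kappa_{1j} - \kappa_{1l})}, $$

where $j > i$, $k$ and $l$ are the two remaining indices, and $\kappa_{1i} = \cos (2\pi i/9)$.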

The values of this expression are:

$l_{12}$: $-1.979 + 2.938\,r_1 - 1.563\,r_2$,

$l_{13}$: $0.926 - 2.106\,r_1 + 3.959\,r_2$,

$l_{14}$: $1.053 - 0.832\,r_1 - 2.396\,r_2$,

$l_{23}$: $-5.012 - 3.959\,r_1 - 6.065\,r_2$,

$l_{24}$: $3.033 + 6.897\,r_1 + 4.502\,r_2$,

$l_{34}$: $-4.086 - 6.065\,r_1 - 2.106\,r_2$,

where, for example, the contribution from $l_{12}$ acts in the region for which $1.563\,r_2 < -1.979 + 2.938\,r_1$. Fig. 2 demonstrates the configuration for this case. It is seen that the frequency surface is a tetrahedron. As particular examples of the identities mentioned above we have

$$ l_{12} + l_{13} + l_{14} = 0, $$

$$ -l_{12} + l_{23} + l_{24} = 0, $$

$$ -l_{13} - l_{23} + l_{34} = 0. $$
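The tabulated coefficients for $n = 9$ can be checked by simple arithmetic. The short Python sketch below (an illustration added here, with hypothetical variable names) verifies the three identities coefficient by coefficient and confirms that each $l_{ij}$ vanishes, to the rounding of the table, at the two points $(\cos 2\pi i/9, \cos 4\pi i/9)$ and $(\cos 2\pi j/9, \cos 4\pi j/9)$ that it joins.

```python
import math

# Tabulated contributions for n = 9: (constant, coefficient of r1, coefficient of r2).
l = {
    (1, 2): (-1.979,  2.938, -1.563),
    (1, 3): ( 0.926, -2.106,  3.959),
    (1, 4): ( 1.053, -0.832, -2.396),
    (2, 3): (-5.012, -3.959, -6.065),
    (2, 4): ( 3.033,  6.897,  4.502),
    (3, 4): (-4.086, -6.065, -2.106),
}

def combine(terms):
    """Coefficient-wise sum of signed contributions such as [(+1, (1, 2)), (-1, (1, 3))]."""
    return tuple(sum(sign * l[key][c] for sign, key in terms) for c in range(3))

identities = [
    [(+1, (1, 2)), (+1, (1, 3)), (+1, (1, 4))],
    [(-1, (1, 2)), (+1, (2, 3)), (+1, (2, 4))],
    [(-1, (1, 3)), (-1, (2, 3)), (+1, (3, 4))],
]
residuals = [combine(t) for t in identities]

def point(i, n=9):
    """The point (cos 2*pi*i/n, cos 4*pi*i/n) on the curve (cos t, cos 2t)."""
    return math.cos(2 * math.pi * i / n), math.cos(4 * math.pi * i / n)

def value_at(key, r1, r2):
    c0, c1, c2 = l[key]
    return c0 + c1 * r1 + c2 * r2

on_line = [value_at(key, *point(i)) for key in l for i in key]
print(residuals, max(abs(v) for v in on_line))
```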

For a general value of $m$, we shall find that the hyperplanes joining sets of $m$ points $(\cos 2\pi i/n, \cos 4\pi i/n, \ldots, \cos 2\pi m i/n)$ will be singularities on the frequency hypersurface. The hyperplanes passing through sets of $m$ successive points will give the limits of possible values of $r_1, \ldots, r_m$. Furthermore, the sum of contributions (with appropriate signs) to the frequency function from the set of $\tfrac12 (n - 2m + 1)$ hyperplanes passing through any point will be zero.

8. Integral approximation for the distribution function. The expression (12) is, of course, difficult to use in practice and we require an approximation similar to that of Koopmans. For this we make use of the integral expression (10) for the joint distribution function of $\bar r_1, \ldots, \bar r_m$ and approximate to the factor involving $(1 - \kappa_j \kappa_{ji})$. This can be done without undue difficulty, but the resulting multiple integral does not appear to be capable of easy reduction. This is hardly surprising, since from the nature of the distribution of the $r_i$ we should expect this approximation to involve $R_m$ raised to a suitable power, and this conjecture is strengthened by the following considerations:

a) The distribution of $\bar r_1$ may be obtained by considering the two sets of observations $x_1, x_2, \ldots, x_{n-1}, x_n$ and $x_2, x_3, \ldots, x_n, x_1$ as unrelated, and using the distribution of the ordinary correlation coefficient corresponding to $n + 3$ pairs of observations (Dixon [6], Quenouille [7]). In the same manner, the $m$ sets of observations $x_1, x_2, \ldots, x_{n-1}, x_n$; $x_2, x_3, \ldots, x_n, x_1$; $\ldots$; $x_m, x_{m+1}, \ldots, x_{m-2}, x_{m-1}$ might be considered as unrelated and the joint distribution of their correlations, given by Garding [5], will involve $R_m$ raised to a suitable power.

b) The outer limits for the joint distribution of $r_1, r_2, \ldots, r_m$ or $\bar r_1, \bar r_2, \ldots, \bar r_m$ for large $n$ will be provided by the equations $R_p = 0$ $(p = 1, \ldots, m)$. An investigation of the properties of the functions $R_1, R_2, \ldots, R_m$ might therefore be expected to throw light upon the joint distribution of $r_1, r_2, \ldots, r_m$ or $\bar r_1, \bar r_2, \ldots, \bar r_m$.

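c) $R_p$ is a quadratic in $r_p$, and may be put equal to $R_{p-2}(r_p' - r_p)(r_p - r_p'')$, where $r_p'$ and $r_p''$ are functions of $r_1, r_2, \ldots, r_{p-1}$ giving the limits of the values that $r_p$ can take for any particular values of $r_1, \ldots, r_{p-1}$. Let $Q_p = R_p/R_{p-1}$, then $Q_p$ is likewise a quadratic in $r_p$, with $r_p$ taking all values between $r_p''$ and $r_p'$, and

$$ \int_{r_p''}^{r_p'} Q_p^{s}\,dr_p = \frac{1}{Q_{p-1}^{s}} \int_{r_p''}^{r_p'} (r_p' - r_p)^s (r_p - r_p'')^s\,dr_p = \frac{B(s + 1, \tfrac12)}{Q_{p-1}^{s}} \left(\frac{r_p' - r_p''}{2}\right)^{2s+1}. $$

But, by expanding $R_p$ as a bordered determinant, it is not difficult to show that $r_p' - r_p'' = 2Q_{p-1}$, so that

$$ \int_{r_p''}^{r_p'} Q_p^{s}\,dr_p = \frac{\Gamma(\tfrac12)\,\Gamma(s + 1)}{\Gamma(s + \tfrac32)}\, Q_{p-1}^{s+1}. $$

In particular, if

$$ f(r_1, \ldots, r_m) = \frac{\Gamma(\tfrac12 n + 1) \cdots \Gamma(\tfrac12 n - m + 2)}{\pi^{m/2}\,\Gamma(\tfrac12 n + \tfrac12) \cdots \Gamma(\tfrac12 n - m + \tfrac32)} \left(\frac{R_m}{R_{m-1}}\right)^{\frac12 (n - 2m + 1)} \tag{13} $$

and if we integrate with respect to $r_m, r_{m-1}, \ldots, r_2$ in turn, we get

$$ \int \cdots \int f(r_1, \ldots, r_m)\,dr_m \cdots dr_2 = \frac{\Gamma(\tfrac12 n + 1)}{\Gamma(\tfrac12)\,\Gamma(\tfrac12 n + \tfrac12)}\,(1 - r_1^2)^{\frac12 (n-1)}, $$

which is the approximate distribution of the first serial correlation coefficient, uncorrected for the mean, as given by Dixon [6].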
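The integration step in c) can be checked numerically for $p = 2$, where $R_1 = 1 - r_1^2$ and the $3 \times 3$ determinant is $R_2 = (1 - r_2)(1 + r_2 - 2r_1^2)$. The following sketch is an illustration under these assumptions, with hypothetical function names and a composite Simpson rule chosen here for simplicity; it compares $\int Q_2^s\,dr_2$ with $\Gamma(\tfrac12)\Gamma(s+1)\,Q_1^{s+1}/\Gamma(s + \tfrac32)$.

```python
import math

def Q1(r1):
    """Q_1 = R_1 / R_0 = 1 - r_1^2."""
    return 1.0 - r1 * r1

def Q2(r1, r2):
    """Q_2 = R_2 / R_1 with R_2 = (1 - r_2)(1 + r_2 - 2 r_1^2); clipped at 0 against rounding."""
    return max(0.0, (1.0 - r2) * (1.0 + r2 - 2.0 * r1 * r1) / Q1(r1))

def integral_Q2_power(r1, s, steps=2000):
    """Composite Simpson rule for the integral of Q_2^s over r_2 from 2 r_1^2 - 1 to 1."""
    lo, hi = 2.0 * r1 * r1 - 1.0, 1.0
    h = (hi - lo) / steps
    total = Q2(r1, lo) ** s + Q2(r1, hi) ** s
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * Q2(r1, lo + k * h) ** s
    return total * h / 3.0

def closed_form(r1, s):
    """Gamma(1/2) Gamma(s+1) / Gamma(s+3/2) * Q_1^(s+1)."""
    return (math.sqrt(math.pi) * math.gamma(s + 1.0) / math.gamma(s + 1.5)
            * Q1(r1) ** (s + 1.0))

print(integral_Q2_power(0.3, 2), closed_form(0.3, 2))
```

Repeating the step $m - 1$ times raises the exponent by one each time, which is how the power $\tfrac12 (n - 2m + 1)$ in (13) becomes $\tfrac12 (n - 1)$ in the marginal distribution of $r_1$.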

The importance of this lies in the fact that the integral correfepondmg to that 
of Koopman’s for the joint distribution is 


Tijn) 

V{^n - m) 





1 ^ I 1^”—^^ 


0 

L- 

HrXl jj 

J 1Y |‘" fi 

sm 

A «(*,) y 
dxi a J 


dxi • ■ • dxm 





where J"' = [n , ■ • • , r„], 


X = 


cos Xi cos 2:s • • • COS Xm 

COS 2xi cos 2Xi coe 2x„ 


L cos mxi cos mx 2 
I = [1,1, ... 1], 


cos mx„ 


Y = 


1 

cos Xi 


1 

cos Xi 


1 

cos Xv 


L cos (m — l)a;i cos (vi — l)a:s 
K'(fl) = [cos 6, cos 28, ■ • ■ , cos md], 

and S is the region given by- 


cos (m — l)a:m J 


1 7 
r X 


> 0. This suggests, by analogy, that the 


joint distribution function is a polynomial in $r_m$ of degree $2(\tfrac12 n - m - 1) + 3 = n - 2m + 1$ which vanishes only when $R_m = 0$. The equation (13) satisfies these conditions, and in addition, it reduces to the known form when $m = 1$ and can be integrated to give this same form. Thus there is a strong suggestion that (13) gives an approximate distribution of $\bar r_1, \bar r_2, \ldots, \bar r_m$, uncorrected for the mean.

An alternative form for the constant factor in (13) may be obtained if we 
note that 


r(in — m + 2) 


r(n — 2 to + 3) 


r(ia - m+ |)irl [r(in - m + .J)]’* 

d) Now r'j, and r, can be written in the forms (Sf-t + i2p_i)//2p-a and 
(Sp-i — Rp-{)/Rp-.i, where 


Thus 



ri 

ri 

rs 

... 0 


1 

n 

Ti 

• • • Tp-i 

= (-1)’^' 

ri 

1 

n 

• ■ • Tp-i 


2 

rp-i 

r p~4 

n 

, / Sp-i + Rp~i 


(r . 

Sp^i ~ 1 

V /i!p. 

-2 

Tp) 

\Tp 

'^lipSi 

!p-l Tl Rp-2 




L V 

R. 

p—1 

)i 



where 


Qp — Qp-i(l — rip (.123. .) 

ri,p+i.2s. = Tp-i/Rp-i , 








and 


^i> h h fp-i 

rp-i 1 n 

Tp-i = fp-j Ti 1 


h rp_2 rp_3 ••• 1 


Therefore, if we make a change of variable to $r_{1,p+1\cdot 23\ldots}$, $r_{1,p\cdot 23\ldots}$, $\ldots$, $r_{13\cdot 2}$, $r_1$, we find that the new variables, which correspond exactly to partial correlation coefficients, are, in fact, independently distributed as such, with 3 degrees of freedom more than in the case where the sets of variables are distinct observations.

While the above properties do not prove that the $r_i$ or $\bar r_i$ may be tested using partial or multiple correlation coefficients, this conjecture has been verified elsewhere and it has been shown [8] that, with certain adjustments, a test can be derived which is applicable to fairly short series.


REFERENCES

[1] R. L. Anderson, "Distribution of the serial correlation coefficient," Annals of Math. Stat., Vol. 13 (1942), pp. 1-13.

[2] T. Koopmans, "Serial correlation and quadratic forms in normal variables," Annals of Math. Stat., Vol. 13 (1942), pp. 14-33.

[3] M. S. Bartlett, "On the theoretical specification of sampling properties of autocorrelated time series," Roy. Stat. Soc. Suppl., Vol. 8 (1946), pp. 27-41.

[4] W. G. Madow, "Note on the distribution of the serial correlation coefficient," Annals of Math. Stat., Vol. 16 (1945), pp. 308-310.

[5] L. Garding, Proceedings of Lund University Mathematical Seminars, Vol. 6, pp. 185-202.

[6] W. J. Dixon, "Further contributions to the problem of serial correlation," Annals of Math. Stat., Vol. 15 (1944), pp. 119-144.

[7] M. H. Quenouille, "Some results in the testing of the serial correlation coefficient," Biometrika, Vol. 35 (1948), pp. 281-7.

[8] M. H. Quenouille, "Approximate tests of correlation in time series I," Roy. Stat. Soc. Suppl., Vol. 11 (1949).




ON THE ESTIMATION OF THE NUMBER OF CLASSES IN A
POPULATION¹

By Leo A. Goodman
Princeton University

1. Summary. This paper deals with the following problem: Suppose a population of known size $N$ is subdivided into an unknown number of mutually exclusive classes. It is assumed that the class in which an element is contained may be determined, but that the classes are not ordered. Let us draw a random sample of $n$ elements without replacement from the population. The problem is to estimate the total number $K$ of classes which subdivide the population on the basis of the sample results and our knowledge of the population size.

There is exactly one real valued statistic $S$ which is an unbiased estimate of $K$ when the sample size $n$ is not less than the maximum number $q$ of elements contained in any class. The restriction placed upon $q$ is unimportant for many practical problems where either there is a reasonably low bound for $q$ or those classes containing more than $n$ elements are known. An unbiased estimate does not exist when there is no such knowledge.

Since the unbiased estimate can be very unreasonable, modifications of $S$ are considered. The statistic

$$ T' = \begin{cases} S' = N - \dfrac{N(N-1)}{n(n-1)}\,x_2, & \text{if } S' \ge \sum_{i=1}^{n} x_i, \\[1ex] \sum_{i=1}^{n} x_i, & \text{if } S' < \sum_{i=1}^{n} x_i, \end{cases} $$

where $x_i$ is the number of classes containing $i$ elements in the sample, is the most suitable estimate, in comparison with three other statistics, for a hypothetical population.

The case where each element in the population has an equal and independent chance of coming into the sample is used as a model for some sampling procedures and also as an approximation to the case of random sampling.


2. Introduction. The problem discussed may be described in terms of colored balls in an urn. How should we estimate the number of colors present in the urn on the basis of both the sample which gives the number of, say, white balls, red balls, etc., and our knowledge of the total number of balls in the urn?

The following practical cases illustrate some of the ways in which this problem presents itself:

(1) A company has received a large number of requests for a free sample of its product. It is known that the same people often send more than one request. From a sample of the requests we wish to estimate how many different people have sent requests.²

(2) The Social Security Board possesses a large collection of Social Security cards. It is known that some people obtain different cards when they change jobs. From a sample of the cards it is desired to estimate how many different people have Social Security cards.³

(3) A person who sells durable commodities anticipates opening a store which is to be located at a highway intersection. He would like to know how many different automobiles pass through the intersection in a given time period. The total number of automobiles may be easily observed but some probably pass through more than once. This type of inquiry is also useful to advertising agencies which must decide the most efficient location for billboards.

(4) The State Unemployment Compensation Board possesses a large list of the people receiving unemployment benefits. It is desired to estimate the total number of families benefiting from the insurance program on the basis of a random sample of the people named on the list.

(5) The number of words in a book may be easily estimated and a sample can be taken. The problem of estimating the number of different words in a book is another analogue of the general problem.⁴

¹ Prepared in connection with research sponsored by the Office of Naval Research.

3. Results and derivations. In order to show that an unbiased estimate of the number of classes in a population exists when the sample size $n$ is not less than the maximum number $q$ of elements contained in any class, we need prove the following two statements:

Lemma 1. Suppose we have $K$ classes of $N$ similar elements with $n_1$ elements in class 1, $n_2$ elements in class 2, $\ldots$, $n_K$ elements in class $K$. The class of an element is readily identifiable when the element is examined. Let

$$ q = \max (n_i). $$

Suppose a random sample is drawn without replacement. If $x_i$ is the number of classes containing $i$ elements in the sample, and $K_j$ is the number of classes containing $j$ elements in the population, then

$$ E(x_i) = \sum_{j=i}^{q} \Pr(i \mid j, N, n)\,K_j, $$

where $\Pr(i \mid j, N, n)$ shall henceforth be an abbreviation of

$$ \binom{j}{i}\binom{N-j}{n-i} \Big/ \binom{N}{n}. $$

² Submitted by Charles Callard to Questions and Answers, The American Statistician, Vol. 3, No. 1, p. 23.

³ Mentioned to the author by Dr. J. Stevens Stock of Opinion Research Corporation.

⁴ Mentioned in letter to the author from Frederick Mosteller of Harvard University.





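Proof. Let $y_s$ be the number of elements appearing in the sample from the $s$-th class. The statement is proved by considering $E(x_i) = \sum_{s=1}^{K} E(\delta_{is})$, where

$$ \delta_{is} = \begin{cases} 1, & \text{if } y_s = i, \\ 0, & \text{if } y_s \ne i. \end{cases} $$

Lemma 2. Let

$$ a^{(i)} = \begin{cases} a(a - 1)(a - 2) \cdots (a - i + 1), & \text{for } i > 0, \\ 1, & \text{for } i = 0. \end{cases} $$

If

$$ A_i = 1 - (-1)^i\,\frac{[N - n + i - 1]^{(i)}}{n^{(i)}}, $$

then

$$ \sum_{i=1}^{j} A_i \Pr(i \mid j, N, n) = 1.⁵ $$

This result follows directly from the fact that

$$ \sum_{i=0}^{j} (-1)^i \binom{j}{i} [N - n + i - 1]^{(j-1)} = 0, \qquad \text{for } j \ge 1. $$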

The following theorem may be proved directly by the preceding lemmas:

Theorem 1. Suppose a sample of $n$ elements is drawn without replacement from a population of size $N$ which is subdivided into $K$ classes. Let

$$ A_i = 1 - (-1)^i\,\frac{[N - n + i - 1]^{(i)}}{n^{(i)}}. $$

If there are $x_i$ classes containing $i$ elements in the sample, then

$$ E(S) = E\Big(\sum_{i=1}^{n} A_i x_i\Big) = K, $$

provided that $n$ is not less than the maximum number $q$ of elements contained in any class in the population.
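Lemma 2 and Theorem 1 can be verified exactly for small cases with rational arithmetic. The sketch below is an illustration, not part of the paper: the function names and the example population are assumptions, and $E(x_i)$ is taken from Lemma 1.

```python
from fractions import Fraction
from math import comb

def falling(a, i):
    """a^(i) = a(a-1)...(a-i+1), with a^(0) = 1."""
    out = 1
    for k in range(i):
        out *= a - k
    return out

def pr(i, j, N, n):
    """Pr(i | j, N, n): hypergeometric chance that i of a class's j elements are sampled."""
    if i > j or n - i > N - j:
        return Fraction(0)
    return Fraction(comb(j, i) * comb(N - j, n - i), comb(N, n))

def A(i, N, n):
    """A_i = 1 - (-1)^i [N-n+i-1]^(i) / n^(i)."""
    return 1 - Fraction((-1) ** i * falling(N - n + i - 1, i), falling(n, i))

def expected_S(class_sizes, n):
    """E(S) = sum_i A_i E(x_i), with E(x_i) = sum over classes of Pr(i | j, N, n)."""
    N = sum(class_sizes)
    return sum(A(i, N, n) * pr(i, j, N, n)
               for j in class_sizes for i in range(1, n + 1))

# Hypothetical population: K = 5 classes of sizes 1, 1, 2, 3, 5 (N = 12, q = 5), sample n = 5.
print(expected_S([1, 1, 2, 3, 5], 5))
```

The same function returns a value different from $K$ in general once some class is larger than the sample, in line with the restriction $n \ge q$.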

Theorem 2. There is at most one real valued statistic which is an unbiased estimate of the number of classes in a population.⁶

⁵ The author is indebted to Professor Frederick F. Stephan of Princeton University for a statement leading to a simplification of the original result.

⁶ This statement was mentioned to the author by M. P. Peisakoff of Princeton University.

Proof. Let us order the points of the sample space in the following manner: Letting $x_i$ be the number of classes containing $i$ elements in the sample, order the sample points by increasing values of $x_n$; for equal values of $x_n$, order the points by increasing values of $x_{n-1}$; for equal values of $x_{n-1}$, order the points by increasing values of $x_{n-2}$; $\cdots$; for equal values of $x_3$, order the points by increasing $x_2$. Let

n 

1-2 


To prove the theorem, we must show that to each $O_i$ there corresponds a unique value $S(i)$, which must be the value of our estimate when $O_i$ is observed, in order that the statistic be unbiased. To each

$$ O_i = (x_1(i), \ldots, x_n(i)) $$

let us associate the population


Pi = 



Xi{i), Xiii) ,x 



If $P_1$ is the underlying population, then $O_i$ for all $i > 1$ will occur with a probability of zero. Since there are $N$ classes in $P_1$, the value of the statistic must be $S(1) = N$ whenever $O_1$ is observed in order that the estimate be unbiased. The theorem may now be proved by induction.

Since all the $P_i$ used in the proof of Theorem 2 satisfied the condition that the maximum number $q$ of elements contained in any class be not more than the sample size $n$, the statistic $S$ is the only real valued statistic which is an unbiased estimate when $q \le n$.

When the restriction that $q \le n$ is removed, it is useless to search for an unbiased estimate since we have

Theorem 3. There does not exist an unbiased estimate of the number of classes subdividing a population when it is not known whether the maximum number $q$ of elements contained in any class is not more than the sample size $n$.

By the preceding theorems it is clear that if an unbiased estimate exists it must equal $S$. However, $S$ is generally not unbiased when $n < q$.

Theorem 4. Suppose the statistics $S_1, S_2, \ldots, S_n$ are the solutions of the system of linear equations

$$ x_i = \sum_{j=i}^{n} \Pr(i \mid j, N, n)\,S_j, \qquad i = 1, 2, \ldots, n, $$

where $x_i$ is the number of classes containing $i$ elements in a sample of size $n$ from a population of $N$ elements. If $K_j$ is the number of classes containing $j$ elements in the population, then $E(S_j) = K_j$, for $j = 1, 2, \ldots, n$, when $n$ is not less than the maximum number $q$ of elements contained in any class.

Proof. We observe that the statement is certainly true for $j = q + 1, q + 2, \ldots, n$, since

$$ E(S_j) = K_j = 0, \qquad \text{for } j = q + 1, q + 2, \ldots, n. $$

The statement is also true for $j = q$, since





To prove that $E(S_j) = K_j$ for any $j < q$, we assume it to be true for all $i > j$, whereupon its truth for $j$ follows.

By Theorems 2 and 3, it is clear that $\sum_{j=1}^{n} S_j = S$. Since

$$ \sum_{i=1}^{q} i K_i = N, $$

it seems reasonable to ask whether the values of the estimates $S_1, S_2, \ldots, S_n$ are in agreement with the known value of the size of the population. The unbiased estimate of $K$ can be shown to be internally consistent by

Theorem 5. Suppose a sample of size $n$ is drawn without replacement from a population of $N$ elements which is divided into classes. If $x_i$ is the number of classes containing $i$ elements in the sample, and if the linear equations

$$ x_i = \sum_{j=i}^{n} \Pr(i \mid j, N, n)\,S_j $$

are solved simultaneously for $S_j$, then

$$ \sum_{j=1}^{n} j S_j = N. $$

The theorem follows readily from the fact that

$$ \sum_{i=1}^{j} i \Pr(i \mid j, N, n) = \frac{nj}{N} \qquad \text{and} \qquad \sum_{i=1}^{n} i x_i = n. $$
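Because $\Pr(i \mid j, N, n) = 0$ for $i > j$, the system in Theorems 4 and 5 is triangular and can be solved by back substitution. The following sketch (hypothetical function names and sample counts, exact rational arithmetic) solves it for one set of observed $x_i$ and confirms $\sum j S_j = N$.

```python
from fractions import Fraction
from math import comb

def pr(i, j, N, n):
    """Pr(i | j, N, n): hypergeometric chance that i of a class's j elements are sampled."""
    if i > j or n - i > N - j:
        return Fraction(0)
    return Fraction(comb(j, i) * comb(N - j, n - i), comb(N, n))

def solve_class_counts(x, N):
    """Solve x_i = sum_{j>=i} Pr(i | j, N, n) S_j for S_1..S_n by back substitution.
    x[i-1] is the number of classes with exactly i elements in the sample."""
    n = sum(i * xi for i, xi in enumerate(x, start=1))   # sample size implied by x
    S = [Fraction(0)] * (n + 1)                          # S[j] for j = 1..n; S[0] unused
    for i in range(n, 0, -1):
        known = sum(pr(i, j, N, n) * S[j] for j in range(i + 1, n + 1))
        S[i] = (Fraction(x[i - 1]) - known) / pr(i, i, N, n)
    return S[1:]

# Hypothetical sample of n = 6 from N = 20: two singletons and two pairs.
x = [2, 2, 0, 0, 0, 0]
S = solve_class_counts(x, 20)
print(S, sum(j * Sj for j, Sj in enumerate(S, start=1)), sum(S))
```

In this example $\sum S_j = -16/3$, a negative estimate of the number of classes, which illustrates how unreasonable the unbiased estimate can be even though the consistency relation of Theorem 5 holds exactly.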


The variance of S may now be calculated by means of the formula 

cl ^ £ AiAjUij => 23 m,i{i,f)K,Ki 

+ iz j) ~ m.M, ;■)]A'A , 

•-•I J 

where utj is the covariance between Xi and x,, is the covariance between 

iiv, and 3yv, when r h,nr ^ s and n* =» t, and m,{i,j) is the covariance between 
and Ssy, when n, = s. 

Since the statistic $S$ can be very unreasonable, we consider other possible estimates of $K$. The statistic

$$ S' = N - \frac{N(N-1)}{n(n-1)}\,x_2 $$

may be shown to be a modification of $S$ which replaces the number $x_i$ of classes containing $i > 2$ elements in the sample by an additional $i x_i$ classes, each containing only one element. Since the values of $K_i$ for $i > 2$ are relatively small in the practical problems of Section 2, $S'$ might be used as an estimate.

Another statistic which may be used to estimate K is 

$$S'' = \frac{N}{n} \sum_{i=1}^{n} x_i.$$

This statistic may be shown to overestimate K whenever q ≠ 1. The estimate

$$S''' = \sum_{i=1}^{n} x_i$$

underestimates K when n < N - m, where m is the least number of elements
contained in any class.


4. Binomial sampling. Let us suppose that each element from a population
of N elements has an equal and independent chance p = 1/r of entering the
sample. In this case, the size of the sample obtained is a random variable η
which is binomially distributed with mean Np. If a large random sample of n
elements is drawn without replacement from a large population of size N, then
the results when interpreted in terms of binomial samples where p = 1/r = n/N
are a good approximation to the results obtained by the usual model. Binomial
sampling may be considered a model of the case where one attempts to obtain
the sampling ratio p = 1/r by drawing simultaneously an uncounted sample of
elements which is estimated as being of the appropriate size.

In the case of binomial sampling, the statistic

$$B = \sum_{i \ge 1} B_i x_i, \quad \text{where } B_i = 1 - (1 - r)^i,$$

may be shown to be an unbiased estimate of the number of classes in a population
from which binomial samples are drawn.
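The unbiasedness claim can be checked exactly for small cases. The sketch below assumes that each element enters the sample independently with probability p = 1/r, so that a class of j elements contributes X ~ Binomial(j, p) elements to the sample, and that such a class adds the weight 1 - (1 - r)^X to the statistic when X ≥ 1 and nothing when X = 0. The function name `expected_weight` is illustrative only.

```python
from fractions import Fraction
from math import comb


def expected_weight(j, r):
    """Exact expected contribution of one class of j elements to the statistic.

    X ~ Binomial(j, 1/r) is the number of its elements in the sample; the class
    contributes 1 - (1 - r)**X when X >= 1 and 0 when X == 0.
    """
    p = Fraction(1, r)
    total = Fraction(0)
    for i in range(1, j + 1):
        prob = comb(j, i) * p**i * (1 - p) ** (j - i)
        total += prob * (1 - (1 - r) ** i)
    return total


# Every class, whatever its size, is counted once on average.
for r in (2, 5, 10):
    for j in range(1, 8):
        assert expected_weight(j, r) == 1
```

Since each class has expected contribution exactly 1, summing over classes gives an expected value equal to the number of classes, which is what unbiasedness means here.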

Let us now consider the statistic B' which corresponds to S' for the case of
binomial sampling; i.e.,

$$B' = N - r^2 x_2.$$


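It may be shown that

$$E(B') = K_1 + K_2 + \sum_{j=3}^{q} K_j \left[ j - \binom{j}{2} (1 - p)^{j-2} \right].$$

Hence, the statistic B' will underestimate K whenever

$$p < 1 - \left( \frac{2}{j} \right)^{1/(j-2)}, \quad \text{for } j = 3, 4, \ldots, q.$$

Since

$$1 - \left( \frac{2}{j} \right)^{1/(j-2)}$$

is a decreasing function of j for j > 2, when p > 1/3, B' overestimates, and when

$$p < 1 - \left( \frac{2}{q} \right)^{1/(q-2)},$$

B' underestimates the value of K. When p is such that

$$1 - \left( \frac{2}{q} \right)^{1/(q-2)} < p < \frac{1}{3},$$

the expected value of B' is brought closer to K by underweighting some $K_j$
and overweighting others.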
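A short numerical sketch may help in reading these conditions. It assumes that B' = N - r²x₂ with r = 1/p, so that under binomial sampling a class of j elements contributes j - C(j, 2)(1 - p)^(j-2) to the expected value of B'. The names `class_weight` and `threshold` are hypothetical.

```python
from math import comb


def class_weight(j, p):
    """Expected contribution of one class of j elements to B' = N - r**2 * x_2."""
    return j - comb(j, 2) * (1 - p) ** (j - 2)


def threshold(j):
    """A class of j >= 3 elements is undercounted exactly when p is below this."""
    return 1 - (2 / j) ** (1 / (j - 2))


# Classes of one or two elements are always counted exactly once on average.
assert class_weight(1, 0.1) == 1 and class_weight(2, 0.1) == 1

# The threshold starts at 1/3 for j = 3 and decreases with j.
values = [threshold(j) for j in range(3, 11)]
assert all(a > b for a, b in zip(values, values[1:]))

for j in range(3, 7):
    print(j, round(threshold(j), 4))
```

With p = 1/10 every class size from 3 upward lies below its threshold, so each such class contributes less than 1 and B' underestimates the number of classes.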


5. A hypothetical population.¹ Suppose we draw a random sample of 1000
elements without replacement from a population of 10,000 elements where

$$K_1 = 9225, \quad K_2 = 336, \quad K_3 = 33, \quad K_4 = 1.$$

Hence, K = 9595. By means of Table 1, let us now compare on the basis of
binomial sampling the estimates which have been presented in the preceding
sections. Since N and n are large, these results are a good approximation to the
case of random sampling without replacement.


TABLE 1

Estimate    Expected value    Bias     √(Mean square error)
S           9595              0        347
B'          9570              -25      207
S''         9909              351      490
S'''        995               -8599    [illegible]
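The row for B' can be recomputed directly. The sketch below takes the class counts to be 9225, 336, 33 and 1 (an assumed reading, chosen so that they give 9595 classes and 10,000 elements), uses p = 1/10, and assumes B' = N - r²x₂. Under binomial sampling the classes enter the sample independently, so the variance of x₂ is a sum of Bernoulli variances.

```python
from math import comb, sqrt

# Assumed class counts: class_counts[j] = number of classes with j elements.
class_counts = {1: 9225, 2: 336, 3: 33, 4: 1}
p, r = 0.1, 10

N = sum(j * k for j, k in class_counts.items())   # population size
K = sum(class_counts.values())                    # number of classes


def prob_two(j):
    """Probability that exactly two elements of a class of size j are sampled."""
    return comb(j, 2) * p**2 * (1 - p) ** (j - 2)


# Mean and variance of x_2, the number of classes with two sampled elements.
mean_x2 = sum(k * prob_two(j) for j, k in class_counts.items())
var_x2 = sum(k * prob_two(j) * (1 - prob_two(j)) for j, k in class_counts.items())

expected_b_prime = N - r**2 * mean_x2
bias = expected_b_prime - K
rmse = sqrt(bias**2 + r**4 * var_x2)

print(round(expected_b_prime), round(bias), round(rmse))  # prints: 9570 -25 207
```

Under these assumptions the computed values agree with the B' row of Table 1.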


It is clear that the best estimates of the number of classes in this particular
population are S or B', since S has the least bias, E(S) - K, and B' has the
least mean square error, E(B' - K)². One might argue that both S and B' are
statistics which are capable of giving nonsensical estimates. However, we
may decide to modify S or B' in order to always get reasonable estimates by
using the statistics

$$T = \begin{cases} S, & \text{if } N \ge S \ge \sum_{i=1}^{n} x_i, \\ N, & \text{if } S > N, \\ \sum_{i=1}^{n} x_i, & \text{if } S < \sum_{i=1}^{n} x_i, \end{cases}$$

$$T' = \begin{cases} B', & \text{if } N \ge B' \ge \sum_{i=1}^{n} x_i, \\ N, & \text{if } B' > N, \\ \sum_{i=1}^{n} x_i, & \text{if } B' < \sum_{i=1}^{n} x_i. \end{cases}$$


¹ Other examples have been investigated by Frederick Mosteller in Questions and
Answers, The American Statistician, Vol. 3, No. 3, p. 12.





Although these modified statistics T and T' are not unbiased, they have the
desirable property that

$$\mathrm{MSE}(T) < \mathrm{MSE}(S), \quad \text{and} \quad \mathrm{MSE}(T') < \mathrm{MSE}(B').$$

Since this hypothetical population is a plausible one for the practical problems 
of Section 2, the modified statistics T or T' seem, therefore, to be “best” for 
estimating the number of classes for these problems, where the “best” statistic 
is defined as the one which never gives unreasonable estimates and has the least 
mean square error. 

The author wishes to express his appreciation to Professor John W. Tukey 
whose suggestions were very helpful. 



CONCERNING COMPOUND RANDOMIZATION IN THE BINARY 

SYSTEM 

By John E. Walsh
The Rand Corporation

1. Summary. Let us consider a set of approximately random binary digits
obtained by some experimental process. This paper outlines a method of compounding
the digits of this set to obtain a smaller set of binary digits which is
much more nearly random. The method presented has the property that the
number of digits in the compounded set is a reasonably large fraction (say of the
magnitude | or i) of the original number of digits. 

If a set of very nearly random decimal digits is required, this can be obtained
by first finding a set of very nearly random binary digits and then converting
these digits to decimal digits.

The concept of "maximum bias" is introduced to measure the degree of
randomness of a set of digits. A small maximum bias shows that the set is very
nearly random.

The question of when a table of approximately random digits can be considered 
suitable for use as a random digit table is investigated. It is found that a table 
will be satisfactory for the usual types of situations to which a random digit 
table is applied if the reciprocal of the number of digits in the table is noticeably 
greater than the maximum bias of the table. 

2. Introduction and discussion. With the development of the theory of games
and the more widespread use of experimental methods for determining approximate
distributions for statistics whose probability laws are difficult to obtain
analytically, a demand for large sets of random digits has arisen. The problem of
obtaining a set of digits which can be considered sufficiently random for the
situations to which it would be applied, however, is not an easy one. One approach
to this problem consists in obtaining a set of digits by some procedure and then
applying tests to this set of digits to determine whether it can be considered
satisfactory. Although appropriate choice of the tests may result in acceptance
of sets of digits which are suitable for certain special types of situations, this
approach is of a negative character and does not prove that a given set of digits
is sufficiently random; it merely indicates that this may be the case. What is
needed is a constructive approach to the problem, i.e., a method of constructing a
set of random digits which can be proved sufficiently random for most applications
if certain intuitively acceptable conditions are satisfied. A step in this
direction has already been taken by H. Burke Horton in [1] and by H. Burke
Horton and R. Tynes Smith III in [2]. This paper presents what is hoped will be
another step in this direction.

In this paper, considerations will be limited to the case of binary digits. The
reasons for this are twofold:




(a). The method used for compounding the digits yields a sharp upper
bound for the maximum bias of the compounded set (i.e., a bound that
the maximum bias could actually attain) only for the case of binary digits.

(b). Many of the experimental procedures for obtaining approximately
random digits consist in first producing binary digits and then converting
to another number base. Thus binary digits are produced directly.
Hence, to use the results of this paper, the only modification required in
these procedures would be to compound the binary digits before they
are converted.

Now let us consider some definitions: A set of random variables each of which 
can assume only the values 0 and 1 will be referred to as a set of binary digits. 
For convenience, each of the random variables making up a set of binary digits 
will be called a binary digit; this is not to be confused with the value obtained 
for the random variable. The absolute value of the deviation from 1/2 of the
conditional probability that a specified binary digit has the value 0 (or 1) is 
called the bias of that digit for the given conditions on the remaining digits of 
the set. The maximum bias of a binary digit is defined to be the maximum of the 
biases of that digit with respect to all possible conditions on the remaining 
digits of the set. The maximum bias of the set is the greatest of the maximum 
biases of the digits of the set. A set of binary digits is said to be random if its 
maximum bias is zero. 
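These definitions can be made concrete for a small set of digits whose joint distribution is known. The sketch below is an illustration only: it assumes that "all possible conditions on the remaining digits" means fixing the values of any subset of the other digits, and the helper names `maximum_bias` and `independent_pmf` are hypothetical.

```python
from itertools import product


def independent_pmf(p_zero):
    """Joint pmf of independent binary digits; p_zero[k] = Pr(digit k = 0)."""
    pmf = {}
    for outcome in product((0, 1), repeat=len(p_zero)):
        prob = 1.0
        for bit, p in zip(outcome, p_zero):
            prob *= p if bit == 0 else 1 - p
        pmf[outcome] = prob
    return pmf


def maximum_bias(pmf, n):
    """Largest |Pr(digit = 0 | condition) - 1/2| over all digits and conditions."""
    worst = 0.0
    for i in range(n):
        others = [k for k in range(n) if k != i]
        # None leaves a digit unconstrained; 0 or 1 fixes its value.
        for cond in product((None, 0, 1), repeat=n - 1):
            num = den = 0.0
            for outcome, prob in pmf.items():
                if all(c is None or outcome[k] == c for k, c in zip(others, cond)):
                    den += prob
                    if outcome[i] == 0:
                        num += prob
            if den > 0:
                worst = max(worst, abs(num / den - 0.5))
    return worst


print(maximum_bias(independent_pmf([0.5, 0.5, 0.5]), 3))   # fair, independent digits
print(maximum_bias({(0, 0): 0.5, (1, 1): 0.5}, 2))          # two identical digits
```

The second example shows why conditioning matters: each digit alone is unbiased, yet the set is far from random because one digit determines the other.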

The method used to prove that a set of compounded digits has a sufficiently 
small maximum bias is somewhat similar to the situation encountered in mathematics
where one begins with certain axioms and then draws conclusions. If the
axioms are correct, the conclusions are necessarily valid. The first step in the 
compounding procedure consists in obtaining a set of binary digits by some 
experimental process (perhaps from a random digit machine which is based on 
some physical principle). The experimental process is so chosen that there is no 
doubt that the set of binary digits produced satisfies the two conditions: 

(i). The maximum bias of the set is less than or equal to some specified
value α (< 1/2).

(ii). The digits of the set can be arranged in a specified array which has the 
property that the rows of the array are statistically independent. 

On the basis of these two assumptions (which play the same role as the axioms
mentioned above), it can be proved that the maximum bias of the resulting
compounded set of binary digits never exceeds a specified value which depends
on α. Moreover, the upper bound for the maximum bias of the constructed set of
binary digits can be made extremely small even for large values of α.

If the experimental process is suitably chosen, conditions (i) and (ii) can be
satisfied beyond any doubt. For example, let us consider 1000 people located in
different parts of the world and not in contact with each other. Let each person
flip an ordinary coin high in the air so that it will land on a flat hard surface,
record the result (say 0 for a tail and 1 for a head), and then repeat
until 5000 binary digits are obtained. If α is set equal to 3/10, condition (i) is
obviously satisfied for the resulting set of 5,000,000 binary digits. Condition (ii)
evidently holds if the array is taken to consist of 1000 rows where each row
contains 5000 binary digits obtained from one person.

The ideal choice for α would be the actual maximum bias of the set of binary
digits obtained from the experimental process. Then the compounding procedure
for obtaining a set of digits with a specified upper bound for the maximum bias
would be simplified; also the number of digits in the compounded set would be a
larger fraction of the original number of digits. Invariably, however, the properties
of the experimental process are not known with sufficient accuracy for
obtaining anything but a safe upper bound on the maximum bias of the set of
digits produced. This situation is analogous to that of estimating the length
of a stick which a very rough measurement has shown to be about 10" long.
Although one might be very hesitant to believe that the length of the stick lies
between 9.9" and 10.1", the contention that the length lies between 5" and 15"
can be accepted with virtual certainty and any logical conclusions based on this
contention can also be accepted with virtual certainty.

Given the number of binary digits in a set and the maximum bias of the set,
is it possible to determine whether the set is suitable for use as a set of random
binary digits? An important consideration in answering this question is the use
that is to be made of the set of digits. This must always be taken into account
before the suitability of the set can be decided. For example, if no more than
1/1000 of the digits of the set are to be used for any particular situation, the
set might be satisfactory for the types of cases to which it would be applied;
on the other hand, the set might not be suitable for cases of these types if all the
digits of the set are used for each situation. This example calls attention to
an important point, namely that the suitability of a set of binary digits depends
on the number of digits in the set. Let a set have a fixed non-zero maximum
bias β. If the set contains a sufficiently large number N of digits, relations and
expressions involving the digits of the set can be found whose probabilities,
moments, etc., can differ greatly from the values which would be obtained if the
relations were based on the same number of truly random binary digits. As a
specific example consider the relation

All the digits of the set have the value zero.

If the reciprocal of the number of digits in the set is of the same order of magnitude
or smaller than the maximum bias of the set, the ratio of the probability
of this expression to its hypothetical value can differ noticeably from unity.
Thus, at least in certain special cases, a necessary condition for the suitability
of a set of binary digits is that 1/N >> β. This condition, however, is also
sufficient for most situations to which a set of random digits would be applied.
The approximate sufficiency of the condition is a direct consequence of the fact
that any set of N binary digits can be considered as a sample value from an
N-dimensional population consisting of $2^N$ discrete points. The 1/N >> β
restriction implies that the probability concentrated at each of the $2^N$ points is
very nearly equal to the hypothetical value of $(1/2)^N$ for all possible conditions
on the remaining digits of the set.

The 1/N >> β condition is very satisfactory from the viewpoint of probabilities.
The probability of any relation based on a subset of the digits of the set
(possibly conditioned on other digits from the table) can be interpreted as the
sum of the probabilities of those points included in a certain region (defined by
the relation) of the N-dimensional probability space of the set of digits. By
expanding $(1/2 \pm \beta)^N$ it can be shown that the ratio of the probability of any
relation based on one or more digits from the set to the corresponding value for a
truly random set of digits will be very nearly equal to unity if 1/N >> β.

It is evident that the higher order moments of an expression based on one or
more digits of the set can differ noticeably from its hypothetical value even if
1/N >> β; any deviation from the ideal situation, no matter how small, can
become important for high order moments. For the first few moments, however,
deviations from the hypothetical values are not appreciable since these moments
are based on the probabilities at the $2^N$ points in the N-dimensional probability
space and these probabilities are very nearly equal to the hypothetical value of
$(1/2)^N$ in all cases.

The above discussion shows that the values of N and β are sufficient to determine
whether a set of binary digits is suitable for use as random binary digits
for a wide variety of situations. Analogous considerations apply for digits to any
number base.

A magnitude definition of the relation 1/N >> β is difficult to specify. If β
is the upper bound for the maximum bias of a set of digits obtained by the
compounding procedure outlined in this paper, however, it seems that a reasonable
condition would be that 1/N > 50β. This condition implies that the
probability of any relation based on digits of the set can not differ from its
hypothetical value by more than approximately 4%. In most practical
applications the value obtained for β would be noticeably greater than the
true value of the maximum bias of the compounded set.
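The 4% figure can be illustrated with the all-zeros relation mentioned earlier. Writing β for the maximum bias and N for the number of digits, each conditional probability of a zero is at most 1/2 + β, so the probability that all N digits are zero is at most (1/2 + β)^N, compared with (1/2)^N for truly random digits. The sketch below evaluates that ratio at the boundary β = 1/(50N); the function name `worst_case_ratio` is hypothetical.

```python
def worst_case_ratio(n_digits, beta):
    """Upper bound on Pr(all digits zero) divided by its ideal value (1/2)**n_digits."""
    return (1 + 2 * beta) ** n_digits


for n_digits in (100, 10_000, 1_000_000):
    beta = 1 / (50 * n_digits)          # boundary case of the condition 1/N > 50*beta
    print(n_digits, round(worst_case_ratio(n_digits, beta), 4))
```

For every table size the ratio stays just below exp(0.04), i.e. about 4% above unity, which is consistent with the figure quoted in the text.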

Since the maximum number of digits which can be taken from a table is the
total number of digits in the table, the above considerations suggest that a
random digit table should be constructed so that the reciprocal of the number of
digits in the table is noticeably greater than the maximum bias of the table.
Any table having this property would be satisfactory for most situations to
which it would be applied.

Now let us consider two different compounding methods which produce sets
of binary digits with the same upper bound for the maximum bias. If the computational
difficulties of applying the two methods are of comparable magnitude,
it seems reasonable to prefer the method which yields the larger number of
compounded digits. For example, if the number of digits in the set obtained by
the first method is only 1/8 of the original number of digits while the number
obtained by the second method is 1/3 of the original number, the second method
would be preferred.


The compounding method presented in this paper has the property that the
number of digits in the compounded set can be held to a reasonably large fraction
of the original number of digits at the same time that the upper bound for the
maximum bias is made extremely small. The method presented by Horton in [1]
does not have this property. For example, let α = 1/10. Applying Horton's
method, when the compounded set consists of 1/8 of the original number of
digits the upper bound for the maximum bias is $12.8 \times 10^{-7}$. The example
presented in section 3, however, shows that a compounded set whose number of
digits equals 1/3 of the original number and which has an upper limit of
$11.7 \times 10^{-7}$ for the maximum bias can be obtained using the method presented in the
next section.

Although the compounding method outlined in section 3 is presented as a
series of steps, the value of a digit of the compounded set can be written as a
linear function (mod 2) of digits of the original set. This was not done in what
follows because of the complicated nature of the general form of such expressions.
In any particular case, however, these expressions can be written without much
trouble and the compounded digits computed from the original digits in a
single step.

3. Outline of compounding method and statement of theorems. This section
contains a description of the compounding method mentioned in the preceding
two sections as well as statements of the basic theorems concerning this compounding
method. Proofs of the results stated in this section are given in section 4.

Let us consider the array of mn binary digits

$$\begin{matrix} x_{11}, & x_{12}, & \cdots, & x_{1n} \\ x_{21}, & x_{22}, & \cdots, & x_{2n} \\ \cdots & & & \\ x_{m1}, & x_{m2}, & \cdots, & x_{mn} \end{matrix} \tag{1}$$

which satisfies conditions (i) and (ii); i.e., the maximum bias of the set (1) is
less than or equal to α while a digit $x_{rs}$ is independent of a digit $x_{uv}$ if r ≠ u
(if r = u, however, $x_{rs}$ is not necessarily independent of $x_{uv}$).

Let a new set of (m - 1)n binary digits

$$y_{ij}, \qquad (i = 1, \ldots, m - 1;\ j = 1, \ldots, n) \tag{2}$$

be formed as follows:

$$y_{ij} = x_{mj} + x_{ij} \pmod 2, \qquad (i = 1, \ldots, m - 1;\ j = 1, \ldots, n).$$

Then the biases of the $y_{ij}$ have the properties

Theorem 1. Let U be a specified set of t - 1 of $y_{1j}, \ldots, y_{(i-1)j}, y_{(i+1)j}, \ldots, y_{(m-1)j}$,
(1 ≤ t ≤ m - 1), while V is a specified set of zero or more of the $y_{pq}$'s
with q ≠ j. Also let θ consist of m together with the set of integers p such that
$y_{pj} \in U$. Then if $\gamma_u$ = maximum bias for the set $x_{u1}, \ldots, x_{un}$, (u = 1, ..., m),

$$\left| \Pr(y_{ij} = 0 \mid U, V) - \tfrac12 \right| \le \gamma_i \left[ 1 - \prod_{k \in \theta} (\tfrac12 - \gamma_k)/(\tfrac12 + \gamma_k) \right] \Big/ \left[ 1 + \prod_{k \in \theta} (\tfrac12 - \gamma_k)/(\tfrac12 + \gamma_k) \right]$$

for all possible selections of U, V and of the values for the digits of these sets.
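Corollary 1. If exactly t - 1 of $y_{1j}, \ldots, y_{(i-1)j}, y_{(i+1)j}, \ldots, y_{(m-1)j}$ have
known values, the maximum bias of the binary digit $y_{ij}$ is less than or equal to

$$\alpha\,[1 - (\tfrac12 - \alpha)^t/(\tfrac12 + \alpha)^t]\,/\,[1 + (\tfrac12 - \alpha)^t/(\tfrac12 + \alpha)^t].$$

Corollary 2. The maximum bias of the set (2) is less than or equal to

$$\alpha\,[1 - (\tfrac12 - \alpha)^{m-1}/(\tfrac12 + \alpha)^{m-1}]\,/\,[1 + (\tfrac12 - \alpha)^{m-1}/(\tfrac12 + \alpha)^{m-1}].$$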
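The basic operation and the bound of Corollary 1 can be sketched as follows. This is an illustration under stated assumptions: the last row of the array is added (mod 2) to each of the other rows, the bound is taken in the form α[1 - ρ^t]/[1 + ρ^t] with ρ = (1/2 - α)/(1/2 + α), and the small exact check assumes m = 3 independent rows in which every digit is 0 with probability 1/2 + α. The function names are hypothetical.

```python
from itertools import product


def compound(rows):
    """Add the last row (mod 2) to every other row of an m x n array of 0/1 digits."""
    *others, last = rows
    return [[(a + b) % 2 for a, b in zip(row, last)] for row in others]


def corollary_bound(alpha, t):
    """Bound on the bias of one compounded digit when t - 1 others are known."""
    rho = ((0.5 - alpha) / (0.5 + alpha)) ** t
    return alpha * (1 - rho) / (1 + rho)


def conditional_bias(alpha):
    """Exact |Pr(y1 = 0 | y2 = 0) - 1/2| for m = 3, every digit 0 w.p. 1/2 + alpha."""
    num = den = 0.0
    for x1, x2, x3 in product((0, 1), repeat=3):
        prob = 1.0
        for bit in (x1, x2, x3):
            prob *= 0.5 + alpha if bit == 0 else 0.5 - alpha
        if (x2 + x3) % 2 == 0:          # condition: y2 = x3 + x2 = 0 (mod 2)
            den += prob
            if (x1 + x3) % 2 == 0:      # event: y1 = x3 + x1 = 0 (mod 2)
                num += prob
    return abs(num / den - 0.5)


x = [[1, 0, 1, 1],
     [0, 0, 1, 0],
     [1, 1, 0, 0]]
print(compound(x))                        # two rows of four compounded digits
print(corollary_bound(0.1, 1))            # unconditional bound, equal to 2 * alpha**2
print(conditional_bias(0.1), corollary_bound(0.1, 2))
```

In this worst case the exact conditional bias coincides with the bound for t = 2, which illustrates the sense in which the bound is sharp.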

The basic operation in the method of compounding binary digits is outlined
in the procedure given for obtaining the $y_{ij}$ from the $x_{uv}$. Let $m = (1 + t_1) \cdots (1 + t_K)$.
Then a set of $t_1 \cdots t_K n$ binary digits can be obtained from the original
set of mn digits by continually applying this basic procedure. The first step
consists in dividing the rows of (1) into $(1 + t_2) \cdots (1 + t_K)$ sets each consisting
of $(1 + t_1)$ rows in some specified fashion. Each of these sets is an array of
$(1 + t_1) \times n$ binary digits for which the rows are independent. Apply the method
used to obtain the $y_{ij}$ from the $x_{uv}$ to each $(1 + t_1) \times n$ array separately. Then
each array yields a set of $t_1 n$ binary digits and there are $(1 + t_2) \cdots (1 + t_K)$
such sets. In each set arrange the $t_1 n$ digits into a single row in some specified
manner. This furnishes a new array of $[(1 + t_2) \cdots (1 + t_K)] \times [t_1 n]$ binary
digits for which the rows are independent. Repeat this procedure with respect to
$t_2$, thus obtaining a new array of $[(1 + t_3) \cdots (1 + t_K)] \times [t_1 t_2 n]$ binary digits for
which the rows are independent; etc., until a $(1 + t_K) \times (t_1 \cdots t_{K-1} n)$ binary
digit array for which the rows are independent is obtained. Then form a set of
binary digits $Y_{gh}$, $(g = 1, \ldots, t_K;\ h = 1, \ldots, t_1 \cdots t_{K-1} n)$, from this array in
exactly the same manner that the $y_{ij}$ were obtained from the $x_{uv}$. Then the
biases of the $Y_{gh}$ have the properties
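Theorem 2. Let $\beta_0, \beta_1, \ldots, \beta_K$ be defined by $\beta_0 = \alpha$ and

$$\beta_w = \beta_{w-1}\,[1 - (\tfrac12 - \beta_{w-1})^{t_w}/(\tfrac12 + \beta_{w-1})^{t_w}]\,/\,[1 + (\tfrac12 - \beta_{w-1})^{t_w}/(\tfrac12 + \beta_{w-1})^{t_w}], \qquad (w = 1, \ldots, K).$$

Then, if exactly t - 1 of $Y_{1h}, \ldots, Y_{(g-1)h}, Y_{(g+1)h}, \ldots, Y_{t_K h}$ have known values,
$(1 \le t \le t_K)$, the maximum bias of the digit $Y_{gh}$ is less than or equal to

$$\beta_{K-1}\,[1 - (\tfrac12 - \beta_{K-1})^t/(\tfrac12 + \beta_{K-1})^t]\,/\,[1 + (\tfrac12 - \beta_{K-1})^t/(\tfrac12 + \beta_{K-1})^t].$$

In particular, the maximum bias of the entire set of $Y_{gh}$ is less than or equal to
$\beta_K$. Also

$$\beta_{K-1}\,[1 - (\tfrac12 - \beta_{K-1})^t/(\tfrac12 + \beta_{K-1})^t]\,/\,[1 + (\tfrac12 - \beta_{K-1})^t/(\tfrac12 + \beta_{K-1})^t] \le 2^{2^K - 1} \cdot t \cdot t_{K-1}^{2} \cdot t_{K-2}^{4} \cdots t_2^{2^{K-2}} \cdot t_1^{2^{K-1}} \cdot \alpha^{2^K}. \tag{3}$$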
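The recursion of Theorem 2 and the rough bound (3) are easy to evaluate numerically. The sketch below assumes the recursion β_w = β_{w-1}[1 - ρ^{t_w}]/[1 + ρ^{t_w}] with ρ = (1/2 - β_{w-1})/(1/2 + β_{w-1}), and evaluates (3) for the whole compounded set, i.e. with t = t_K. The function names are hypothetical, and the values of t_1, ..., t_K are simply supplied as inputs.

```python
def beta_sequence(alpha, ts):
    """Return [beta_0, ..., beta_K] for stage sizes ts = [t_1, ..., t_K]."""
    betas = [alpha]
    for t in ts:
        b = betas[-1]
        rho = ((0.5 - b) / (0.5 + b)) ** t
        betas.append(b * (1 - rho) / (1 + rho))
    return betas


def rough_bound(alpha, ts):
    """2^(2^K - 1) * t_K * t_{K-1}^2 * ... * t_1^(2^(K-1)) * alpha^(2^K)."""
    K = len(ts)
    bound = 2.0 ** (2**K - 1) * alpha ** (2**K)
    for w, t in enumerate(ts, start=1):
        bound *= t ** (2 ** (K - w))
    return bound


alpha = 0.1
for ts in ([1], [1, 2], [1, 3, 9], [1, 3, 10, 44]):
    exact = beta_sequence(alpha, ts)[-1]
    print(ts, f"beta_K = {exact:.3e}", f"rough bound = {rough_bound(alpha, ts):.3e}")
```

For these inputs the rough bound is only slightly larger than β_K, so it can often replace the full recursion when one only needs the order of magnitude.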





The inequality (3) is frequently useful from a computational viewpoint.
Although the right hand side of (3) is usually noticeably greater than the left
hand side, in many cases this rough upper bound is itself small enough to show
that the upper bound for the maximum bias is of the desired order of magnitude.
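The multi-stage procedure described before Theorem 2 can be sketched on an actual array of digits. The text leaves the grouping of rows and the arrangement of digits into a single row "specified" but open; the sketch below assumes consecutive rows are grouped together and that each compounded group is flattened row by row. The function names are hypothetical.

```python
def compound(rows):
    """Add the last row (mod 2) to every other row of an array of 0/1 digits."""
    *others, last = rows
    return [[(a + b) % 2 for a, b in zip(row, last)] for row in others]


def compound_stages(rows, ts):
    """Apply the basic operation in K stages; rows must number prod(1 + t_w)."""
    for stage, t in enumerate(ts, start=1):
        size = t + 1
        assert len(rows) % size == 0, "row count must be divisible by 1 + t_w"
        groups = [rows[i:i + size] for i in range(0, len(rows), size)]
        if stage < len(ts):
            # Each group yields t rows; lay them end to end as one new row.
            rows = [[d for row in compound(g) for d in row] for g in groups]
        else:
            (final_group,) = groups      # exactly 1 + t_K rows must remain
            rows = compound(final_group)
    return rows


# m = (1 + 1)(1 + 2) = 6 rows of n = 2 digits, compounded with t_1 = 1, t_2 = 2.
x = [[1, 0], [0, 0], [1, 1], [1, 0], [0, 1], [0, 0]]
y = compound_stages(x, [1, 2])
print(y)   # t_2 = 2 rows of t_1 * n = 2 digits each
```

Here 12 original digits yield 4 compounded digits, the fraction t_1 t_2 / [(1 + t_1)(1 + t_2)] = 1/3, and each output digit is a sum (mod 2) of original digits, in line with the remark that the compounded digits are linear functions of the original set.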

If the set of compounded digits is to be used for a random binary digit table,
Theorem 2 shows that advantage can be taken of the position of the digits in the
table. Let $M = t_1 \cdots t_{K-1} n$ and enter the values of the $Y_{gh}$, $(g = 1, \ldots, t_K;\ h = 1, \ldots, M)$,
into the table in the order

$$Y_{11}, Y_{12}, \ldots, Y_{1M}, Y_{21}, \ldots, Y_{2M}, Y_{31}, \ldots, Y_{t_K M}.$$

Then, if a set of digits is taken from this table in consecutive order ($Y_{11}$ follows
$Y_{t_K M}$), the upper bound for the maximum bias of this set is dependent on the
number L of digits in the set. From Theorem 2, the maximum bias of a set of L
digits taken in consecutive order from a table formed in this manner is less
than or equal to

$$\beta_{K-1}\,[1 - (\tfrac12 - \beta_{K-1})^t/(\tfrac12 + \beta_{K-1})^t]\,/\,[1 + (\tfrac12 - \beta_{K-1})^t/(\tfrac12 + \beta_{K-1})^t]$$

for values of L such that $(t - 1)M < L \le tM$, where $1 \le t \le t_K$. Thus, if a
small set of digits is taken from this table in consecutive order, the upper bound
for the maximum bias of this set will usually be noticeably smaller than the
upper bound for the maximum bias of the table. Since many uses of a random
digit table require only a small fraction of the total number of entries in this
table, this property would seem to be desirable. It should be emphasized, however,
that the maximum bias of a set taken from this table is always less than
or equal to $\beta_K$ irrespective of the positions that the digits of the set occupy in
the table. Thus nothing is lost by constructing the table in this manner but
something can be gained for small sets if the digits are taken from the table in
consecutive order.

Now let us consider situations in which it is required that the number of
digits in the compounded set is at least a specified fraction, say 1/C, of the
original number mn of binary digits. This requires that K and $t_1, \ldots, t_K$ be
chosen so that

$$t_1 \cdots t_K / [(1 + t_1) \cdots (1 + t_K)] \ge 1/C.$$

Also, for given values of K and C, it seems preferable to choose $t_1, \ldots, t_K$ so that
the value of $\beta_K$ is at least approximately minimized. Examination of the results of
Theorem 2 indicates that a reasonable method of determining the values of
$t_1, \ldots, t_K$ with this in mind consists in first choosing $t_1$ as small as possible, then
(given the value of $t_1$ equal to its minimum value) choosing $t_2$ as small as possible,
etc. This method is also recommended by the fact that the resulting values of
$t_1, \ldots, t_K$ are readily determined. The explicit procedure for finding $t_1, \ldots, t_K$
is given by

Theorem 3. Let the values of the integer K and the constant C (> 1) be given and
consider the integers $t_1, \ldots, t_K$ subject to the condition

$$t_1 \cdots t_K / [(1 + t_1) \cdots (1 + t_K)] \ge 1/C.$$

The minimum value of $t_1$ is the smallest integer satisfying

$$t_1 > 1/(C - 1).$$

In general, 2 ≤ w ≤ K - 1, having already determined $t_1, \ldots, t_{w-1}$ as their
minimum values, the value of $t_w$ is the smallest integer satisfying

$$t_w > 1/[C t_1 \cdots t_{w-1}/((1 + t_1) \cdots (1 + t_{w-1})) - 1].$$

Finally, given $t_1, \ldots, t_{K-1}$ as their minimum values, the minimum value of $t_K$
is the smallest integer satisfying

$$t_K \ge 1/[C t_1 \cdots t_{K-1}/((1 + t_1) \cdots (1 + t_{K-1})) - 1].$$

Now consider the general situation encountered in the application of the
compounding process outlined above. Here the values of α, C are given and it is
required to choose K and $t_1, \ldots, t_K$ so that the upper bound for the maximum
bias of the compounded set of $t_1 \cdots t_K n$ binary digits $Y_{gh}$ is less than or equal to a
specified value b. The following procedure furnishes a method of solving this
problem:

Let K = 1, obtain $t_1$ according to Theorem 3, and then compute $\beta_1$. If $\beta_1 \le b$, a
solution has been obtained. If $\beta_1 > b$, let K = 2 and repeat the procedure to
obtain $\beta_2$. If $\beta_2 \le b$, the values of $t_1$, $t_2$ and K = 2 are a solution. If $\beta_2 > b$,
repeat the procedure for K = 3; etc. In practical situations, the value of K is
usually bounded (e.g., by independence properties of the original set of digits).
If $\beta_K$ is still greater than b for the maximum permissible value of K, no solution is
obtained. This means that either b must be increased or 1/C decreased or both
if a solution is to be found. In many cases, a large amount of computation can be
avoided by using the inequality (3). For marginal situations, however, a solution
may be missed by using (3) instead of computing $\beta_K$.

Example of method. The following table represents an example of application
of the above method:

α = 1/10

1/C = 1/3

b = 2 X 10-* 

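K = 1,   $t_1 = 1$,   $\beta_1 = 2 \times 10^{-2}$

K = 2,   $t_1 = 1$, $t_2 = 2$,   $\beta_2 < 1.6 \times 10^{-3}$

K = 3,   $t_1 = 1$, $t_2 = 3$, $t_3 = 9$,   $\beta_3 < 1.04 \times 10^{-4}$

K = 4,   $t_1 = 1$, $t_2 = 3$, $t_3 = 10$, $t_4 = 44$,   $\beta_4 < 1.17 \times 10^{-6}$

Thus K = 4, $t_1 = 1$, $t_2 = 3$, $t_3 = 10$, $t_4 = 44$ is a solution.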


4. Derivations. The purpose of this section is to furnish proofs of the results
stated in the preceding sections.

4.1 Proof of Theorem 1. Let us consider the conditional probability that an
arbitrary but fixed $y_{ij}$ has a specified value when the values of a fixed subset of
zero or more of the remaining y's are known. For convenience, assume that $y_{11}$
is the binary digit considered and that the values of $y_{21}, y_{31}, \ldots, y_{t1}$ (where t
is a fixed integer such that 1 ≤ t ≤ m - 1) and a set S are given while the
values of the remaining y's are unknown. Here S represents an arbitrary but
fixed set of zero or more of the $y_{ij}$ for which j ≥ 2 while t = 1 has the interpretation
that none of the $y_{i1}$, (i ≥ 2), are given. Let

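$$\Pr(x_{m1} = 0 \mid S) = \tfrac12 + a_{t+1} \quad \text{and} \quad \Pr(x_{k1} = b_k \mid S) = \tfrac12 + a_k, \qquad (k = 1, \ldots, t).$$

Then, using the independence conditions satisfied by the $x_{uv}$,

$$\Pr(y_{11} = b_1 \mid y_{21} = b_2, \ldots, y_{t1} = b_t;\ S) = \frac{\prod_{k=1}^{t+1} (\tfrac12 + a_k) + \prod_{k=1}^{t+1} (\tfrac12 - a_k)}{\prod_{k=2}^{t+1} (\tfrac12 + a_k) + \prod_{k=2}^{t+1} (\tfrac12 - a_k)} = \tfrac12 + a_1 \left[ 1 - \prod_{k=2}^{t+1} \frac{\tfrac12 - a_k}{\tfrac12 + a_k} \right] \Big/ \left[ 1 + \prod_{k=2}^{t+1} \frac{\tfrac12 - a_k}{\tfrac12 + a_k} \right].$$

Now $|(1 - P)/(1 + P)|$ equals (1 - P)/(1 + P) if 0 ≤ P ≤ 1 and equals (P - 1)/(1 + P) if
P > 1, where $P = \prod_{k=2}^{t+1} (\tfrac12 - a_k)/(\tfrac12 + a_k)$. Let $\gamma_u$ be the maximum bias for the
set of binary digits $x_{u1}, \ldots, x_{un}$, (u = 1, ..., m). Then it is easily seen that

$$\left| \frac{1 - P}{1 + P} \right| \le \left[ 1 - \prod_{k=2}^{t+1} \frac{\tfrac12 - \gamma_k}{\tfrac12 + \gamma_k} \right] \Big/ \left[ 1 + \prod_{k=2}^{t+1} \frac{\tfrac12 - \gamma_k}{\tfrac12 + \gamma_k} \right],$$

where $\gamma_{t+1}$ is understood as $\gamma_m$, the index t + 1 having been attached to $x_{m1}$ above.
Thus

$$\left| \Pr(y_{11} = b_1 \mid y_{21} = b_2, \ldots, y_{t1} = b_t;\ S) - \tfrac12 \right| \le \gamma_1 \left[ 1 - \prod_{k=2}^{t+1} \frac{\tfrac12 - \gamma_k}{\tfrac12 + \gamma_k} \right] \Big/ \left[ 1 + \prod_{k=2}^{t+1} \frac{\tfrac12 - \gamma_k}{\tfrac12 + \gamma_k} \right]$$

for all possible selections of $b_1, \ldots, b_t$ and all possible selections of S and the
values for the digits of S. It is to be observed that this inequality is valid for t = 1.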

Evidently this result can be modified to apply to an arbitrary $y_{ij}$ for which
t - 1 of $y_{1j}, \ldots, y_{(i-1)j}, y_{(i+1)j}, \ldots, y_{(m-1)j}$ have given values. This obvious
modification results in Theorem 1.

4.2 Proof of Theorem 2. By Corollary 2, the maximum bias of the
$[(1 + t_2) \cdots (1 + t_K)] \times [t_1 n]$ array is less than or equal to $\beta_1$. In general, 2 ≤ w ≤ K - 1, by
Corollary 2 the maximum bias of the $[(1 + t_{w+1}) \cdots (1 + t_K)] \times [t_1 \cdots t_w n]$
array is less than or equal to $\beta_w$. Finally, by Corollary 1, if exactly t - 1 of
$Y_{1h}, \ldots, Y_{(g-1)h}, Y_{(g+1)h}, \ldots, Y_{t_K h}$ have known values, $(1 \le t \le t_K)$, the
maximum bias for the binary digit $Y_{gh}$ is less than or equal to

$$\beta_{K-1}\,[1 - (\tfrac12 - \beta_{K-1})^t/(\tfrac12 + \beta_{K-1})^t]\,/\,[1 + (\tfrac12 - \beta_{K-1})^t/(\tfrac12 + \beta_{K-1})^t].$$

The inequality (3) is an immediate consequence of the relation

$$\alpha\,[1 - (\tfrac12 - \alpha)^s/(\tfrac12 + \alpha)^s]\,/\,[1 + (\tfrac12 - \alpha)^s/(\tfrac12 + \alpha)^s] \le 2 s \alpha^2.$$

4.3 Proof of Theorem 3. From the given condition

$$t_K \ge 1/[C t_1 \cdots t_{K-1}/((1 + t_1) \cdots (1 + t_{K-1})) - 1].$$

From this inequality for $t_K$ it follows that

$$C t_1 \cdots t_{K-1}/((1 + t_1) \cdots (1 + t_{K-1})) - 1 > 0.$$

Thus

$$t_{K-1} > 1/[C t_1 \cdots t_{K-2}/((1 + t_1) \cdots (1 + t_{K-2})) - 1].$$

In general, 3 ≤ w ≤ K - 1, given

$$t_w > 1/[C t_1 \cdots t_{w-1}/((1 + t_1) \cdots (1 + t_{w-1})) - 1]$$

it follows that

$$C t_1 \cdots t_{w-1}/((1 + t_1) \cdots (1 + t_{w-1})) - 1 > 0$$

whence

$$t_{w-1} > 1/[C t_1 \cdots t_{w-2}/((1 + t_1) \cdots (1 + t_{w-2})) - 1].$$

Finally

$$t_1 > 1/(C - 1).$$


REFERENCES

[1] H. Burke Horton, "A method for obtaining random numbers," Annals of Math. Stat.,
Vol. 19 (1948), pp. 81-86.

[2] H. Burke Horton and R. Tynes Smith III, "A direct method for producing random
digits in any number system," Annals of Math. Stat., Vol. 20 (1949), pp. 82-90.



THE DISTRIBUTION OF EXTREME VALUES IN SAMPLES WHOSE
MEMBERS ARE SUBJECT TO A MARKOFF CHAIN CONDITION

By Benjamin Epstein

Department of Mathematics, Wayne University

1. Introduction. The extreme value problem as treated in the literature
concerns itself with the following question: To find the distribution of the
smallest, largest, or more generally the νth largest, or νth smallest values in
random samples of size n, drawn from a distribution whose probability law is
given by the d.f. F(x). In this formulation the observed sample values $x_1, \ldots, x_n$
are assumed to be statistically independent. While the assumption of independence
may be a good approximation to the true state of affairs in some
cases, there are situations where this assumption is not justified.

Suppose, for instance, that the observations in the sample are ordered in time.
Then it may happen that successive observations are stochastically dependent,
the extent of this dependence being a function of the time interval separating
these observations.¹ In such cases the present distribution theory for extreme
values in samples of size n is inadequate and must be replaced by more general
results.

It is clear that a clean-cut analytic solution to the problem of the distribution
of extreme values in samples whose members may be stochastically dependent
can be expected only for certain special kinds of dependence among successive
observations. We are able, in this paper, to obtain the distribution of smallest,
largest, second smallest, and second largest values in samples of size n drawn at
equally spaced time intervals from a stationary Markoff process.

2. The distribution of smallest and largest values in samples of size n drawn 
at equally spaced time intervals from a stationary Markoff process. In this 
section the following assumption is made:

(A) observations $x_1, x_2, \ldots$ are taken in order at times t = 1,
t = 2, ..., t = n, ... from a stationary Markoff random process.

The only information needed in the investigation of a stationary Markoff
process at integral values of time is the function

$$F_2(x, y) = \text{Prob}\,(x_i \le x,\ x_{i+1} \le y), \tag{1}$$

independently of i, where $F_2(x, y)$ must be such that the marginal distribution
obtained by integrating over x or y (if $x_i$ or $x_{i+1}$ take on a continuous range of
values) or summing over the possible values of $x_i$ or $x_{i+1}$ (if $x_i$ and $x_{i+1}$ can take
on only discrete values) is of the form

$$F_1(x) = \text{Prob}\,(x_i \le x), \tag{2}$$

independently of i.

¹ If the observations $x_1, x_2, \ldots, x_n, \ldots$ are taken at discrete times $t_1, t_2, \ldots$,
a measure of stochastic dependence between $x_i$ and $x_j$ is the ordinary coefficient of correlation
$\rho_{ij}$. If the observations are taken from a continuous stochastic process a natural
measure of stochastic dependence between observations made at two different times is the
covariance function of the process. In this paper we shall limit ourselves to processes which
are discrete in time.

An example of a random process meeting condition A is furnished by the
Ornstein-Uhlenbeck process [1; 2]. In this case the joint d.f. of $x_i$ and $x_{i+1}$ is
given by a non-singular bivariate Gaussian distribution. The results in the
present paper are stated completely in terms of the d.f.'s $F_2(x, y)$ and $F_1(x)$
defining the stationary Markoff process and will in particular be valid for observations
taken at uniformly spaced time intervals from an Ornstein-Uhlenbeck
process.

In this section we shall find the distribution of smallest and largest values in
samples $x_1, x_2, \dots, x_n$ drawn from a random process under assumption A and
specified by the bivariate d.f. $F_2(x, y)$ and the associated one dimensional
marginal d.f. $F_1(x)$. We first prove Theorem I.

Theorem I. Under assumption A, the distribution of largest values in samples of
size n is given by the d.f. $G_n^{(1)}(x) = [F_2(x, x)]^{n-1} / [F_1(x)]^{n-2}$.

To prove this result we note that $G_n^{(1)}(x)$, the probability that the largest
value in samples of size $n$ is $\le x$, is given by

(3)  $G_n^{(1)}(x) = \operatorname{Prob}(x_1 \le x,\ x_2 \le x,\ \dots,\ x_n \le x)$.

To evaluate the right-hand side of (3) we proceed as follows:

(4)  $\operatorname{Prob}(x_1 \le x, x_2 \le x, \dots, x_n \le x) = \operatorname{Prob}(x_1 \le x, x_2 \le x, \dots, x_{n-1} \le x)\, \operatorname{Prob}(x_n \le x \mid x_1 \le x, \dots, x_{n-1} \le x)$.

But under assumption A, (4) becomes

(5)  $\operatorname{Prob}(x_1 \le x, x_2 \le x, \dots, x_n \le x) = \operatorname{Prob}(x_1 \le x, x_2 \le x, \dots, x_{n-1} \le x)\, \operatorname{Prob}(x_n \le x \mid x_{n-1} \le x)$

or

(5')  $G_n^{(1)}(x) = G_{n-1}^{(1)}(x)\, \operatorname{Prob}(x_n \le x \mid x_{n-1} \le x)$.

But according to assumption A, and (1) and (2)

(6)  $\operatorname{Prob}(x_n \le x \mid x_{n-1} \le x) = \operatorname{Prob}(x_{n-1} \le x,\ x_n \le x) / \operatorname{Prob}(x_{n-1} \le x) = F_2(x, x) / F_1(x)$.

Therefore

(7)  $G_n^{(1)}(x) = G_{n-1}^{(1)}(x)\, F_2(x, x) / F_1(x) = G_1^{(1)}(x)\, [F_2(x, x)]^{n-1} / [F_1(x)]^{n-1} = [F_2(x, x)]^{n-1} / [F_1(x)]^{n-2}$.

This proves Theorem I.

For $n = 1$, 2, and 3 respectively one gets

(8)  $G_1^{(1)}(x) = F_1(x)$,  $G_2^{(1)}(x) = F_2(x, x)$,  $G_3^{(1)}(x) = [F_2(x, x)]^2 / F_1(x)$.
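The closed form of Theorem I can be evaluated directly once $F_1(x)$ and $F_2(x, x)$ are known at the point of interest. The Python sketch below is illustrative only: the name `largest_value_cdf` and the arguments `f1`, `f2` are assumptions introduced here, and the caller is assumed to supply the two d.f. values. Under independence, $F_2(x, x) = [F_1(x)]^2$ and the formula should reduce to the classical $[F_1(x)]^n$.

```python
def largest_value_cdf(f1: float, f2: float, n: int) -> float:
    """Evaluate G_n^(1)(x) = F_2(x,x)^(n-1) / F_1(x)^(n-2) at a single point x.

    f1 -- value of the marginal d.f. F_1(x) (hypothetical argument name)
    f2 -- value of the bivariate d.f. F_2(x, x) (hypothetical argument name)
    n  -- sample size, n >= 1
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return f1          # G_1^(1)(x) = F_1(x), first entry of (8)
    if f1 == 0.0:
        return 0.0         # F_2(x,x) <= F_1(x) = 0, so the d.f. vanishes
    return f2 ** (n - 1) / f1 ** (n - 2)


# Independence check: F_2(x,x) = F_1(x)^2 should give F_1(x)^n.
p = 0.9
independent_case = largest_value_cdf(p, p * p, 5)
```

Working with the two numbers $F_1(x)$ and $F_2(x, x)$ rather than with a particular process keeps the sketch within what the theorem itself states.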

Theorem II. Under assumption A, the distribution of smallest values in samples
of size n is given by the d.f.

(9)  $H_n^{(1)}(x) = 1 - [1 - 2F_1(x) + F_2(x, x)]^{n-1} / [1 - F_1(x)]^{n-2}$.

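To prove this result we first note that $H_n^{(1)}(x)$, the probability that the smallest
value in samples of size $n$ be $\le x$, is given by

$H_n^{(1)}(x) = 1 - \operatorname{Prob}(x_1 > x,\ x_2 > x,\ \dots,\ x_n > x)$.

To evaluate $H_n^{(1)}(x)$ we proceed as follows:

(10)  $\operatorname{Prob}(x_1 > x, x_2 > x, \dots, x_n > x) = \operatorname{Prob}(x_1 > x, x_2 > x, \dots, x_{n-1} > x)\, \operatorname{Prob}(x_n > x \mid x_1 > x, \dots, x_{n-1} > x)$.

But under assumption A, (10) becomes

(11)  $\operatorname{Prob}(x_1 > x, x_2 > x, \dots, x_n > x) = \operatorname{Prob}(x_1 > x, x_2 > x, \dots, x_{n-1} > x)\, \operatorname{Prob}(x_n > x \mid x_{n-1} > x)$.

But

(12)  $\operatorname{Prob}(x_n > x \mid x_{n-1} > x) = \operatorname{Prob}(x_{n-1} > x,\ x_n > x) / \operatorname{Prob}(x_{n-1} > x)$.

To evaluate $\operatorname{Prob}(x_{n-1} > x,\ x_n > x)$ we note that

(13)  $\operatorname{Prob}(x_{n-1} > x, x_n > x) + \operatorname{Prob}(x_{n-1} \le x, x_n > x) + \operatorname{Prob}(x_{n-1} > x, x_n \le x) + \operatorname{Prob}(x_{n-1} \le x, x_n \le x) = 1$.

Also

(14)  $\operatorname{Prob}(x_{n-1} \le x, x_n > x) + \operatorname{Prob}(x_{n-1} \le x, x_n \le x) = \operatorname{Prob}(x_{n-1} \le x)$,

and

(15)  $\operatorname{Prob}(x_{n-1} > x, x_n \le x) + \operatorname{Prob}(x_{n-1} \le x, x_n \le x) = \operatorname{Prob}(x_n \le x)$.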

Recalling that

(16)  $F_2(x, x) = \operatorname{Prob}(x_{n-1} \le x,\ x_n \le x)$

and

(17)  $F_1(x) = \operatorname{Prob}(x_{n-1} \le x) = \operatorname{Prob}(x_n \le x)$

we get

(18)  $\operatorname{Prob}(x_{n-1} > x,\ x_n > x) = 1 - 2F_1(x) + F_2(x, x)$.

Therefore (10) becomes

(19)  $\operatorname{Prob}(x_1 > x, x_2 > x, \dots, x_{n-1} > x, x_n > x) = \operatorname{Prob}(x_1 > x, x_2 > x, \dots, x_{n-1} > x)\, [1 - 2F_1(x) + F_2(x, x)] / (1 - F_1(x))$.

Applying the recursion formula (19) successively we obtain

(20)  $\operatorname{Prob}(x_1 > x, x_2 > x, \dots, x_n > x) = \operatorname{Prob}(x_1 > x)\, [1 - 2F_1(x) + F_2(x, x)]^{n-1} / [1 - F_1(x)]^{n-1} = [1 - 2F_1(x) + F_2(x, x)]^{n-1} / [1 - F_1(x)]^{n-2}$.

Therefore $H_n^{(1)}(x)$, the probability that the smallest value in samples of size $n$
is $\le x$, is given by:

(21)  $H_n^{(1)}(x) = 1 - [1 - 2F_1(x) + F_2(x, x)]^{n-1} / [1 - F_1(x)]^{n-2}$.

This completes the proof of Theorem II.

In particular for $n = 1$, 2, and 3 respectively the d.f.'s of the smallest value in
samples of size $n$ are given by:

(22)  $H_1^{(1)}(x) = F_1(x)$,  $H_2^{(1)}(x) = 2F_1(x) - F_2(x, x)$,  $H_3^{(1)}(x) = 1 - [1 - 2F_1(x) + F_2(x, x)]^2 / [1 - F_1(x)]$.
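Formula (21) can be checked numerically in the same spirit. The sketch below is a minimal illustration, assuming the caller supplies $F_1(x)$ and $F_2(x, x)$; the names `smallest_value_cdf`, `f1` and `f2` are hypothetical. It uses the identity (18), $\operatorname{Prob}(x_{n-1} > x, x_n > x) = 1 - 2F_1(x) + F_2(x, x)$, as an intermediate quantity.

```python
def smallest_value_cdf(f1: float, f2: float, n: int) -> float:
    """Evaluate H_n^(1)(x) of formula (21) at a single point x.

    f1 -- value of F_1(x); f2 -- value of F_2(x, x); n -- sample size (n >= 1).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return f1                       # H_1^(1)(x) = F_1(x)
    exceed_one = 1.0 - f1               # Prob(x_i > x)
    exceed_pair = 1.0 - 2.0 * f1 + f2   # Prob(x_{i} > x, x_{i+1} > x), eq. (18)
    if exceed_one == 0.0:
        return 1.0                      # every observation is <= x with certainty
    return 1.0 - exceed_pair ** (n - 1) / exceed_one ** (n - 2)


# Independence check: F_2(x,x) = F_1(x)^2 should give 1 - (1 - F_1(x))^n.
p = 0.3
independent_case = smallest_value_cdf(p, p * p, 4)
```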

3. Distribution of the second largest and second smallest values in samples
of size n drawn at equally spaced time intervals from a stationary Markoff
process. Under assumption A of Section 2 we can state the following theorem.

Theorem III. Under assumption A the distribution of second largest values in
samples of size n, n > 2, is given by the d.f. $G_n^{(2)}(x)$,

$G_n^{(2)}(x) = \dfrac{[F_2(x, x)]^{n-1}}{[F_1(x)]^{n-2}} + \dfrac{2[F_2(x, x)]^{n-2}\,[F_1(x) - F_2(x, x)]}{[F_1(x)]^{n-2}} + \dfrac{(n - 2)\,[F_2(x, x)]^{n-3}\,[F_1(x) - F_2(x, x)]^2}{[F_1(x)]^{n-3}\,(1 - F_1(x))}$.

To prove this result we first note that $G_n^{(2)}(x)$, the probability that the second
largest value is $\le x$, is given by

(23)  $G_n^{(2)}(x) = \operatorname{Prob}(x_1 \le x, x_2 \le x, \dots, x_n \le x)$
      $+ \operatorname{Prob}(x_1 > x, x_2 \le x, x_3 \le x, \dots, x_n \le x)$
      $+ \operatorname{Prob}(x_1 \le x, x_2 > x, x_3 \le x, x_4 \le x, \dots, x_n \le x) + \cdots$
      $+ \operatorname{Prob}(x_1 \le x, x_2 \le x, \dots, x_{n-2} \le x, x_{n-1} > x, x_n \le x)$
      $+ \operatorname{Prob}(x_1 \le x, x_2 \le x, \dots, x_{n-1} \le x, x_n > x)$.
According to Theorem I

(24)  $\operatorname{Prob}(x_1 \le x, x_2 \le x, \dots, x_n \le x) = [F_2(x, x)]^{n-1} / [F_1(x)]^{n-2}$.

It can readily be shown that

(25)  $\operatorname{Prob}(x_1 > x, x_2 \le x, x_3 \le x, \dots, x_n \le x) = \operatorname{Prob}(x_1 \le x, x_2 \le x, \dots, x_{n-1} \le x, x_n > x) = [F_2(x, x)]^{n-2}\,[F_1(x) - F_2(x, x)] / [F_1(x)]^{n-2}$.

It can also be shown that each of the remaining $(n - 2)$ terms on the right-hand
side of (23) is equal to

(26)  $[F_2(x, x)]^{n-3}\,[F_1(x) - F_2(x, x)]^2 / \{[F_1(x)]^{n-3}\,(1 - F_1(x))\}$.

Combining (23), (24), (25), and (26) we get the desired result in Theorem
III, i.e.,

(27)  $G_n^{(2)}(x) = \dfrac{[F_2(x, x)]^{n-1}}{[F_1(x)]^{n-2}} + \dfrac{2[F_2(x, x)]^{n-2}\,[F_1(x) - F_2(x, x)]}{[F_1(x)]^{n-2}} + \dfrac{(n - 2)\,[F_2(x, x)]^{n-3}\,[F_1(x) - F_2(x, x)]^2}{[F_1(x)]^{n-3}\,(1 - F_1(x))}$.
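The three groups of terms in (27) can be evaluated separately, which makes the structure of the proof visible: one term for no exceedance, two end terms as in (25), and $(n - 2)$ interior terms as in (26). The following Python sketch is an illustration under the assumption that $0 < F_1(x) < 1$; the function and argument names are hypothetical. Under independence the result should equal $p^n + n p^{n-1}(1 - p)$ with $p = F_1(x)$.

```python
def second_largest_cdf(f1: float, f2: float, n: int) -> float:
    """Evaluate G_n^(2)(x) of formula (27) at a single point x.

    f1 -- value of F_1(x), assumed strictly between 0 and 1
    f2 -- value of F_2(x, x)
    n  -- sample size, n > 2 as required by Theorem III
    """
    if n <= 2:
        raise ValueError("Theorem III requires n > 2")
    if not 0.0 < f1 < 1.0:
        raise ValueError("this sketch assumes 0 < F_1(x) < 1")
    gap = f1 - f2                                    # Prob(x_i > x, x_{i+1} <= x)
    none_exceed = f2 ** (n - 1) / f1 ** (n - 2)      # eq. (24)
    end_terms = 2.0 * f2 ** (n - 2) * gap / f1 ** (n - 2)   # two terms of eq. (25)
    interior_terms = (n - 2) * f2 ** (n - 3) * gap ** 2 / (
        f1 ** (n - 3) * (1.0 - f1)
    )                                                # (n - 2) terms of eq. (26)
    return none_exceed + end_terms + interior_terms


# Independence check with p = 0.8, n = 4: p^4 + 4 p^3 (1 - p) = 0.8192.
independent_case = second_largest_cdf(0.8, 0.64, 4)
```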


In a similar way one can prove Theorem IV.

Theorem IV. Under assumption A, the distribution of second smallest values in
samples of size n, n > 2, is given by the d.f. $H_n^{(2)}(x)$,

(28)  $H_n^{(2)}(x) = 1 - \dfrac{[1 - 2F_1(x) + F_2(x, x)]^{n-1}}{[1 - F_1(x)]^{n-2}} - \dfrac{2[1 - 2F_1(x) + F_2(x, x)]^{n-2}\,[F_1(x) - F_2(x, x)]}{[1 - F_1(x)]^{n-2}} - \dfrac{(n - 2)\,[1 - 2F_1(x) + F_2(x, x)]^{n-3}\,[F_1(x) - F_2(x, x)]^2}{[1 - F_1(x)]^{n-3}\,F_1(x)}$.
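Formula (28) mirrors (27) with the roles of "at most $x$" and "greater than $x$" exchanged. A minimal Python sketch follows, again assuming $0 < F_1(x) < 1$ and using hypothetical names; it is an illustration of the formula, not the author's computation. Under independence it should reduce to $1 - q^n - n p\, q^{n-1}$ with $p = F_1(x)$, $q = 1 - p$.

```python
def second_smallest_cdf(f1: float, f2: float, n: int) -> float:
    """Evaluate H_n^(2)(x) of formula (28) at a single point x.

    f1 -- value of F_1(x), assumed strictly between 0 and 1
    f2 -- value of F_2(x, x)
    n  -- sample size, n > 2 as required by Theorem IV
    """
    if n <= 2:
        raise ValueError("Theorem IV requires n > 2")
    if not 0.0 < f1 < 1.0:
        raise ValueError("this sketch assumes 0 < F_1(x) < 1")
    exceed_one = 1.0 - f1                # Prob(x_i > x)
    exceed_pair = 1.0 - 2.0 * f1 + f2    # Prob(x_i > x, x_{i+1} > x), eq. (18)
    gap = f1 - f2                        # Prob(x_i <= x, x_{i+1} > x)
    all_exceed = exceed_pair ** (n - 1) / exceed_one ** (n - 2)
    end_terms = 2.0 * exceed_pair ** (n - 2) * gap / exceed_one ** (n - 2)
    interior_terms = (n - 2) * exceed_pair ** (n - 3) * gap ** 2 / (
        exceed_one ** (n - 3) * f1
    )
    return 1.0 - all_exceed - end_terms - interior_terms


# Independence check with p = 0.2, n = 4: 1 - q^4 - 4 p q^3 = 0.1808.
independent_case = second_smallest_cdf(0.2, 0.04, 4)
```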


REFERENCES

[1] J. L. Doob, "The brownian movement and stochastic equations," Annals of Mathematics,
Vol. 43 (1942), p. 351.

[2] M. C. Wang and G. E. Uhlenbeck, "On the theory of the brownian motion II," Reviews
of Modern Physics, Vol. 17 (1945), p. 323.



NOTES 


This section is devoted to brief research and expository articles and other short items.


NOTE ON THE CONSISTENCY OF THE MAXIMUM LIKELIHOOD
ESTIMATE¹

By Abraham Wald
Columbia University

1. Introduction. The problem of consistency of the maximum likelihood
estimate has been treated in the literature by several authors (see, for example,
Doob [1]² and Cramér [2]³). The purpose of this note is to give another proof of the
consistency of the maximum likelihood estimate which may be of interest because
of its relative simplicity and because of the easy verifiability of the underlying
assumptions. The present proof has some common features with that given by
Doob, insofar that both proofs make no differentiability assumptions (thus, not
even the existence of the likelihood equation is postulated) and both are based
on the strong law of large numbers and an inequality involving the log of a
random variable. The assumptions in the present note are stronger in some
respects than those made by Doob, but also the results obtained here are stronger.
For the sake of simplicity, the author did not attempt to give the most general
results or to weaken the underlying assumptions as much as possible. Remarks
on possible generalizations are made in Section 4.

Let $X_1, X_2, \dots$, etc. be independently and identically distributed chance
variables. The most frequently considered case in the literature is that where
the common distribution is known, except for the values of a finite number of
parameters, $\theta^1, \dots, \theta^k$. In this note we shall treat the parametric case. For
any parameter point $\theta = (\theta^1, \dots, \theta^k)$, let $F(x, \theta)$ denote the corresponding
cumulative distribution function of $X_i$; i.e., $F(x, \theta) = \operatorname{prob}\{X_i < x\}$. The
totality $\Omega$ of all possible parameter points is called the parameter space. Thus,
the parameter space $\Omega$ is a subset of the $k$-dimensional Cartesian space.

¹ The author wishes to thank J. L. Doob for several comments and suggestions he made
in connection with this note.

² According to a communication from Doob, his Theorem 4 is incorrect, but is correct if
the class of almost everywhere continuous functions in that theorem is replaced by a suitable
class C of functions. The class C can be any one of a variety of classes; for example, the class
of bounded almost everywhere continuous functions, or the larger class of almost everywhere
continuous functions each of which is less than or equal in modulus to any one of a
prescribed sequence of functions with finite expectations. His Theorem 6 on the consistency
of the maximum likelihood is then dependent on the class C used in Theorem 4.

³ The proof given by Cramér [2], pp. 500-504, establishes the consistency of some root
of the likelihood equation but not necessarily that of the maximum likelihood estimate
when the likelihood equation has several roots. Recently, Huzurbazar [3] showed that
under certain regularity conditions the likelihood equation has at most one consistent
solution and that the likelihood function has a relative maximum for such a solution.
Since there may be several solutions for which the likelihood function has relative maxima,
Cramér's and Huzurbazar's results taken together still do not imply that a solution of the
likelihood equation which makes the likelihood function an absolute maximum is necessarily
consistent.

It is assumed in this note that for any $\theta$, the cumulative distribution function
$F(x, \theta)$ admits an elementary probability law $f(x, \theta)$. If $F(x, \theta)$ is absolutely
continuous, $f(x, \theta)$ denotes the density at $x$. If $F(x, \theta)$ is discrete, $f(x, \theta)$ is equal
to the probability that $X_i = x$.

Throughout this note the following assumptions will be made.

Assumption 1. $F(x, \theta)$ is either discrete for all $\theta$ or is absolutely continuous
for all $\theta$.

Before formulating the next assumption, we shall introduce the following
notations: for any $\theta$ and for any positive value $\rho$ let $f(x, \theta, \rho)$ be the supremum of
$f(x, \theta')$ with respect to $\theta'$ when $|\theta - \theta'| \le \rho$. For any positive $r$, let $\varphi(x, r)$
be the supremum of $f(x, \theta)$ with respect to $\theta$ when $|\theta| > r$. Furthermore, let
$f^*(x, \theta, \rho) = f(x, \theta, \rho)$ when $f(x, \theta, \rho) > 1$, and $= 1$ otherwise. Similarly, let
$\varphi^*(x, r) = \varphi(x, r)$ when $\varphi(x, r) > 1$, and $= 1$ otherwise.

Assumption 2. For sufficiently small $\rho$ and for sufficiently large $r$ the expected
values $\int_{-\infty}^{\infty} \log f^*(x, \theta, \rho)\, dF(x, \theta_0)$ and $\int_{-\infty}^{\infty} \log \varphi^*(x, r)\, dF(x, \theta_0)$ are finite, where
$\theta_0$ denotes the true parameter point.⁴

Assumption 3. If $\lim_{i \to \infty} \theta_i = \theta$, then $\lim_{i \to \infty} f(x, \theta_i) = f(x, \theta)$ for all $x$ except perhaps
on a set which may depend on the limit point $\theta$ (but not on the sequence $\theta_i$) and
whose probability measure is zero according to the probability distribution corresponding
to the true parameter point $\theta_0$.

Assumption 4. If $\theta_1$ is a parameter point different from the true parameter point
$\theta_0$, then $F(x, \theta_1) \ne F(x, \theta_0)$ for at least one value of $x$.

Assumption 5. If $\lim_{i \to \infty} |\theta_i| = \infty$, then $\lim_{i \to \infty} f(x, \theta_i) = 0$ for any $x$ except perhaps
on a fixed set (independent of the sequence $\theta_i$) whose probability is zero according
to the true parameter point $\theta_0$.

Assumption 6. For the true parameter point $\theta_0$ we have

$\int_{-\infty}^{\infty} |\log f(x, \theta_0)|\, dF(x, \theta_0) < \infty$.

Assumption 7. The parameter space $\Omega$ is a closed subset of the $k$-dimensional
Cartesian space.

Assumption 8. $f(x, \theta, \rho)$ is a measurable function of $x$ for any $\theta$ and $\rho$.

It is of interest to note that if we forbid the dependence of the exceptional set
on $\theta$ in Assumption 3, Assumption 8 is a consequence of Assumption 3, as can
easily be verified.
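The notation $f(x, \theta, \rho)$ introduced before Assumption 2 may be easier to follow with a concrete family. The Python sketch below assumes, purely for illustration, the normal location family with unit variance (this family and all function names are assumptions made here, not part of the note). For that family the supremum of $f(x, \theta')$ over $|\theta - \theta'| \le \rho$ is attained at the point of the interval $[\theta - \rho, \theta + \rho]$ nearest to $x$.

```python
import math


def normal_density(x: float, theta: float) -> float:
    """Density f(x, theta) of a normal law with mean theta and unit variance."""
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2.0 * math.pi)


def sup_density(x: float, theta: float, rho: float) -> float:
    """f(x, theta, rho): supremum of f(x, theta') over |theta - theta'| <= rho.

    For the unimodal normal location family the supremum is reached at the
    admissible theta' closest to x, i.e. x clamped to [theta - rho, theta + rho].
    """
    nearest = min(max(x, theta - rho), theta + rho)
    return normal_density(x, nearest)


def sup_density_star(x: float, theta: float, rho: float) -> float:
    """f*(x, theta, rho): equal to f(x, theta, rho) when it exceeds 1, else 1."""
    value = sup_density(x, theta, rho)
    return value if value > 1.0 else 1.0
```

In this family the density never exceeds $1/\sqrt{2\pi} < 1$, so `sup_density_star` is identically 1 and the first integral in Assumption 2 is trivially finite.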

⁴ The measurability of the functions $f^*(x, \theta, \rho)$ and $\varphi^*(x, r)$ for any $\theta$, $\rho$ and $r$ follows
easily from Assumption 8.





In the discrete case, Assumption 8 is unnecessary. In fact, we may replace
$f(x, \theta, \rho)$ everywhere by $\tilde{f}(x, \theta, \rho)$ where $\tilde{f}(x, \theta, \rho) = f(x, \theta, \rho)$ when $f(x, \theta_0) > 0$,
and $\tilde{f}(x, \theta, \rho) = 1$ when $f(x, \theta_0) = 0$. Here $\theta_0$ denotes the true parameter point.
Since $f(x, \theta_0) > 0$ only for countably many values of $x$, $\tilde{f}(x, \theta, \rho)$ is obviously a
measurable function of $x$.

In the absolutely continuous case, $F(x, \theta)$ does not determine $f(x, \theta)$ uniquely.
If Assumptions 3, 5 and 8 hold for one choice of $f(x, \theta)$, they do not necessarily
hold for another choice of $f(x, \theta)$. This is in a way undesirable, but assumptions
of such nature are unavoidable if we want to insure the consistency of the
maximum likelihood estimate. It is, however, possible to formulate assumptions
which remain valid for all possible choices of $f(x, \theta)$ and which insure the consistency
of the maximum likelihood estimate for a particular choice of $f(x, \theta)$.
In this connection the following remark due to Doob is of interest. Let Assumptions
3' and 5' be the same as 3 and 5, respectively, except that the exceptional
set is permitted to depend on the sequence $\theta_i$. If 3' and 5' hold for one choice of
$f(x, \theta)$, they also hold for any other choice. Doob has shown that Assumptions 3'
and 5' insure the existence of a choice of $f(x, \theta)$ for which Assumptions 3, 5 and 8
hold. Thus, one may say that Assumptions 3' and 5' are the essential ones and
the stronger assumptions 3, 5 and 8 are needed merely to exclude a "bad"
choice of $f(x, \theta)$.


2. Some lemmas. In this section we shall prove some lemmas which will be
used in the next section to obtain the main theorems. Let $\theta_0$ be the true parameter
point. By the expected value $Eu$ of any chance variable $u$ we shall mean the
expected value determined under the assumption that $\theta_0$ is the true parameter
point. For any chance variable $u$, $u'$ will denote the chance variable which is
equal to $u$ when $u > 0$ and equal to zero otherwise. Similarly, for any chance
variable $u$, the symbol $u''$ will be used to denote the chance variable which is
equal to $u$ when $u < 0$ and equal to zero otherwise. We shall say that the expected
value of $u$ exists if $Eu' < \infty$. If the expected value of $u'$ is finite but that of $u''$
is not, we shall say that the expected value of $u$ is equal to $-\infty$.

Lemma 1. For any $\theta \ne \theta_0$ we have

(1)  $E \log f(X, \theta) < E \log f(X, \theta_0)$

where $X$ is a chance variable with the distribution $F(x, \theta_0)$.

Proof. It follows from Assumption 2 that the expected values in (1) exist.
Because of Assumption 6, we have

(2)  $E\,|\log f(X, \theta_0)| < \infty$.

If $E \log f(X, \theta) = -\infty$, Lemma 1 obviously holds. Thus, we shall merely consider
the case when $E \log f(X, \theta) > -\infty$. Then

(3)  $E\,|\log f(X, \theta)| < \infty$.

Let $u = \log f(X, \theta) - \log f(X, \theta_0)$.⁵ Clearly, $E|u| < \infty$. It is known that for
any chance variable $u$ which is not equal to a constant (with probability one)
and for which $E|u| < \infty$, we have⁶

(4)  $Eu < \log E e^{u}$.

Since in our case

(5)  $E e^{u} \le 1$,

and since $u$ differs from zero on a set of positive probability (due to Assumption
4), we obtain from (4)

(6)  $Eu < 0$.

Thus, Lemma 1 is proved.
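Inequality (1) can be seen at work in a small discrete example. The sketch below assumes a Bernoulli family with $f(1, \theta) = \theta$ and $f(0, \theta) = 1 - \theta$; the family and the function name are illustrative assumptions, not taken from the note. It computes $E \log f(X, \theta)$ exactly when $X$ follows the true parameter $\theta_0$, so that the strict inequality of Lemma 1 can be checked for a few values of $\theta \ne \theta_0$.

```python
import math


def expected_log_density(theta: float, theta0: float) -> float:
    """E log f(X, theta) for X ~ Bernoulli(theta0), with 0 < theta, theta0 < 1.

    The expectation is a two-term sum because X takes only the values 1 and 0.
    """
    return theta0 * math.log(theta) + (1.0 - theta0) * math.log(1.0 - theta)


theta0 = 0.3
at_truth = expected_log_density(theta0, theta0)
# Differences E log f(X, theta) - E log f(X, theta0); Lemma 1 says each is negative.
differences = {
    theta: expected_log_density(theta, theta0) - at_truth
    for theta in (0.1, 0.2, 0.5, 0.9)
}
```

Each difference is $Eu$ for $u = \log f(X, \theta) - \log f(X, \theta_0)$, the quantity bounded in (4)-(6).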

We shall now prove the following lemma.

Lemma 2. $\lim_{\rho \to 0} E \log f(X, \theta, \rho) = E \log f(X, \theta)$.

Proof. Let $f^*(x, \theta, \rho) = f(x, \theta, \rho)$ when $f(x, \theta, \rho) \ge 1$, and $= 1$ otherwise.
Similarly, let $f^*(x, \theta) = f(x, \theta)$ when $f(x, \theta) \ge 1$, and $= 1$ otherwise. It follows
from Assumption 3 that

(7)  $\lim_{\rho \to 0} \log f^*(x, \theta, \rho) = \log f^*(x, \theta)$

except perhaps on a set whose probability measure is zero. Since $\log f^*(x, \theta, \rho)$
is an increasing function of $\rho$, it follows from (7) and Assumption 2 that

(8)  $\lim_{\rho \to 0} E \log f^*(X, \theta, \rho) = E \log f^*(X, \theta)$.

Let $f^{**}(x, \theta, \rho) = f(x, \theta, \rho)$ when $f(x, \theta, \rho) \le 1$, and $= 1$ otherwise. Similarly, let
$f^{**}(x, \theta) = f(x, \theta)$ when $f(x, \theta) \le 1$, and $= 1$ otherwise. Clearly,

(9)  $|\log f^{**}(x, \theta, \rho)| \le |\log f^{**}(x, \theta)|$

and

(10)  $\lim_{\rho \to 0} \log f^{**}(x, \theta, \rho) = \log f^{**}(x, \theta)$

for all $x$ except perhaps on a set whose probability measure is zero. The relation

(11)  $\lim_{\rho \to 0} E \log f^{**}(X, \theta, \rho) = E \log f^{**}(X, \theta)$

follows from (9) and (10) in both cases, when $E \log f^{**}(X, \theta)$ is finite and when
$E \log f^{**}(X, \theta) = -\infty$. Lemma 2 is an immediate consequence of (8) and (11).
Lemma 3. The equation

(12)  $\lim_{r \to \infty} E \log \varphi(X, r) = -\infty$

holds.


⁵ It is of no consequence what value is assigned to $u$ when $f(x, \theta)$ or $f(x, \theta_0)$ is zero, since
the probability of such an event, because of (3), is zero.

⁶ This is a generalization of the inequality between geometric and arithmetic means.
See, for example, Hardy, Littlewood, Pólya, Inequalities, Cambridge 1934, p. 137, Theorem 184.





Proof. It follows from Assumption 5 that

(13)  $\lim_{r \to \infty} \log \varphi(x, r) = -\infty$

for any $x$ (except perhaps on a set of probability 0). Since according to Assumption 2,

(14)  $E \log \varphi^*(X, r) < \infty$,

and since $\log \varphi(x, r) - \log \varphi^*(x, r)$ and $\log \varphi^*(x, r)$ are decreasing functions of
$r$, Lemma 3 follows easily from (13).


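3. The main theorems. We shall now prove the following theorems.

Theorem 1. Let $\omega$ be any closed subset of the parameter space $\Omega$ which does not
contain the true parameter point $\theta_0$. Then

(15)  $\operatorname{prob}\left\{ \lim_{n \to \infty} \dfrac{\sup_{\theta \in \omega} f(X_1, \theta) f(X_2, \theta) \cdots f(X_n, \theta)}{f(X_1, \theta_0) f(X_2, \theta_0) \cdots f(X_n, \theta_0)} = 0 \right\} = 1$.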


Proof. Let $r_0$ be a positive number chosen such that

(16)  $E \log \varphi(X, r_0) < E \log f(X, \theta_0)$.

The existence of such a positive number follows from Lemma 3. Let $\omega_1$ be the
subset of $\omega$ consisting of all points $\theta$ of $\omega$ for which $|\theta| \le r_0$. With each point $\theta$
in $\omega_1$ we associate a positive value $\rho_\theta$ such that

(17)  $E \log f(X, \theta, \rho_\theta) < E \log f(X, \theta_0)$.

The existence of such a $\rho_\theta$ follows from Lemmas 1 and 2. Since the set $\omega_1$ is
compact, there exists a finite number of points $\theta_1, \dots, \theta_h$ in $\omega_1$ such that
$S(\theta_1, \rho_{\theta_1}) + \cdots + S(\theta_h, \rho_{\theta_h})$ contains $\omega_1$ as a subset. Here $S(\theta, \rho)$ denotes the
sphere with center $\theta$ and radius $\rho$. Clearly,

$0 \le \sup_{\theta \in \omega} f(x_1, \theta) \cdots f(x_n, \theta) \le \sum_{i=1}^{h} f(x_1, \theta_i, \rho_{\theta_i}) \cdots f(x_n, \theta_i, \rho_{\theta_i}) + \varphi(x_1, r_0) \cdots \varphi(x_n, r_0)$.


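Hence, Theorem 1 is proved if we can show that

(18)  $\operatorname{prob}\left\{ \lim_{n \to \infty} \dfrac{f(X_1, \theta_i, \rho_{\theta_i}) \cdots f(X_n, \theta_i, \rho_{\theta_i})}{f(X_1, \theta_0) \cdots f(X_n, \theta_0)} = 0 \right\} = 1$  $(i = 1, \dots, h)$

and

(19)  $\operatorname{prob}\left\{ \lim_{n \to \infty} \dfrac{\varphi(X_1, r_0) \cdots \varphi(X_n, r_0)}{f(X_1, \theta_0) \cdots f(X_n, \theta_0)} = 0 \right\} = 1$.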



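The above equations can be written as

(20)  $\operatorname{prob}\left\{ \lim_{n \to \infty} \sum_{\alpha=1}^{n} [\log f(X_\alpha, \theta_i, \rho_{\theta_i}) - \log f(X_\alpha, \theta_0)] = -\infty \right\} = 1$  $(i = 1, \dots, h)$

and

(21)  $\operatorname{prob}\left\{ \lim_{n \to \infty} \sum_{\alpha=1}^{n} [\log \varphi(X_\alpha, r_0) - \log f(X_\alpha, \theta_0)] = -\infty \right\} = 1$.

These equations follow immediately from (16), (17) and the strong law of large
numbers. This completes the proof of Theorem 1.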

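Theorem 2. Let $\bar{\theta}_n(x_1, \dots, x_n)$ be a function of the observations $x_1, \dots, x_n$ such
that

(22)  $\dfrac{f(x_1, \bar{\theta}_n) \cdots f(x_n, \bar{\theta}_n)}{f(x_1, \theta_0) \cdots f(x_n, \theta_0)} \ge c > 0$ for all $n$ and for all $x_1, \dots, x_n$.

Then

(23)  $\operatorname{prob}\{\lim_{n \to \infty} \bar{\theta}_n = \theta_0\} = 1$.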


Proof. It is sufficient to prove that for any $\epsilon > 0$ the probability is one that all
limit points $\bar{\theta}$ of the sequence $\{\bar{\theta}_n\}$ satisfy the inequality $|\bar{\theta} - \theta_0| \le \epsilon$. The
event that there exists a limit point $\bar{\theta}$ of the sequence $\{\bar{\theta}_n\}$ such that $|\bar{\theta} - \theta_0| > \epsilon$
implies that $\sup_{|\theta - \theta_0| \ge \epsilon} f(x_1, \theta) \cdots f(x_n, \theta) \ge f(x_1, \bar{\theta}_n) \cdots f(x_n, \bar{\theta}_n)$ for infinitely
many $n$. But then

(24)  $\dfrac{\sup_{|\theta - \theta_0| \ge \epsilon} f(x_1, \theta) \cdots f(x_n, \theta)}{f(x_1, \theta_0) \cdots f(x_n, \theta_0)} \ge c > 0$

for infinitely many $n$. Since, according to Theorem 1, this is an event with
probability zero, we have shown that the probability is one that all limit points
$\bar{\theta}$ of $\{\bar{\theta}_n\}$ satisfy the inequality $|\bar{\theta} - \theta_0| \le \epsilon$. This completes the proof of
Theorem 2.

Since a maximum likelihood estimate $\hat{\theta}_n(x_1, \dots, x_n)$, if it exists, obviously
satisfies (22) with $c = 1$, Theorem 2 establishes the consistency of $\hat{\theta}_n(x_1, \dots, x_n)$
as an estimate of $\theta$.
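The behaviour described by (20) and by Theorem 2 can be observed in a small simulation. The sketch below assumes a Bernoulli family, a fixed random seed, and hypothetical helper names, none of which come from the note; it is an illustration, not a proof. It draws a sample under $\theta_0$, evaluates the log likelihood ratio at a fixed $\theta \ne \theta_0$ (which should drift toward $-\infty$ as $n$ grows), and computes the maximum likelihood estimate, which for this family is the sample mean.

```python
import math
import random


def simulate_bernoulli(theta0: float, n: int, seed: int = 0) -> list:
    """Draw n independent Bernoulli(theta0) observations with a fixed seed."""
    rng = random.Random(seed)
    return [1 if rng.random() < theta0 else 0 for _ in range(n)]


def log_likelihood(sample: list, theta: float) -> float:
    """Sum of log f(x_i, theta) with f(1, theta) = theta, f(0, theta) = 1 - theta."""
    ones = sum(sample)
    zeros = len(sample) - ones
    return ones * math.log(theta) + zeros * math.log(1.0 - theta)


def maximum_likelihood_estimate(sample: list) -> float:
    """For the Bernoulli family the likelihood is maximized by the sample mean."""
    return sum(sample) / len(sample)


sample = simulate_bernoulli(theta0=0.3, n=20000)
# Log of the likelihood ratio appearing in (15), at the single point theta = 0.5.
log_ratio = log_likelihood(sample, 0.5) - log_likelihood(sample, 0.3)
estimate = maximum_likelihood_estimate(sample)
```

With 20000 observations the log ratio is expected to be strongly negative and the estimate close to 0.3; the exact figures depend on the seed.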


4. Remarks on possible generalizations. The method given in this note can be
extended to establish the consistency of the maximum likelihood estimates for
certain types of dependent chance variables for which the strong law of large
numbers remains valid.

The assumption that the parameter space $\Omega$ is a subset of a finite dimensional
Cartesian space is unnecessarily restrictive. Let $\Omega$ be any abstract space. All of
our results can easily be shown to remain valid if Assumptions 3, 5 and 7 are
replaced by the following one:

Assumption 9. It is possible to introduce a distance $\delta(\theta_1, \theta_2)$ in the space $\Omega$ such
that the following four conditions hold:

(i) The distance $\delta(\theta_1, \theta_2)$ makes $\Omega$ a metric space.

(ii) $\lim_{i \to \infty} f(x, \theta_i) = f(x, \theta)$ if $\lim_{i \to \infty} \theta_i = \theta$, for any $x$ except perhaps on a set which
may depend on $\theta$ (but not on the sequence $\theta_i$) and whose probability measure is zero
according to the probability distribution corresponding to the true parameter point $\theta_0$.

(iii) If $\theta_0$ is a fixed point in $\Omega$ and $\lim_{i \to \infty} \delta(\theta_i, \theta_0) = \infty$, then $\lim_{i \to \infty} f(x, \theta_i) = 0$
for any $x$.

(iv) Any closed and bounded subset of $\Omega$ is compact.

REFERENCES

[1] J. L. Doob, "Probability and statistics," Trans. Amer. Math. Soc., Vol. 36 (1934).

[2] H. Cramér, Mathematical Methods of Statistics, Princeton University Press, Princeton,
1946.

[3] V. S. Huzurbazar, "The likelihood equation, consistency and the maxima of the likelihood
function," Annals of Eugenics, Vol. 14 (1948).


ON WALD'S PROOF OF THE CONSISTENCY OF THE MAXIMUM
LIKELIHOOD ESTIMATE

By J. Wolfowitz

Columbia University

This note is written by way of comment on the pretty and ingenious proof of
the consistency of the maximum likelihood estimate which is due to Wald and is
printed in the present issue of the Annals. The notation of this paper of Wald's
will henceforth be assumed unless the contrary is specified.

The consistency of the maximum likelihood estimate is a "weak" rather than
a "strong" property, in the technical meaning which these words have in the
theory of probability, i.e., it is a property of distribution functions rather than of
infinite sequences of observations. Prof. Wald actually proves strong convergence,
which is more than consistency. His proof uses the strong law of large numbers,
and he remarks that his method "can be extended to establish consistency of the
maximum likelihood estimates for certain types of dependent chance variables
for which the strong law of large numbers remains valid." Below we shall use
Wald's lemmas to give a proof of consistency which employs only the weak law
of large numbers. Not only does this proof have the advantage of being expeditious,
but it can be extended to a larger class of dependent chance variables.
The consistency of the maximum likelihood estimate follows from the following
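Theorem. Let $\eta$ and $\epsilon$ be given, arbitrarily small, positive numbers. Let $S(\theta_0, \eta)$
be the open sphere with center $\theta_0$ and radius $\eta$, and let $\Omega(\eta) = \Omega - S(\theta_0, \eta)$. There
exists a number $h(\eta)$, $0 < h < 1$, and another
positive number $N(\eta, \epsilon)$ such that, for any $n > N(\eta, \epsilon)$,

$P_0\left\{ \dfrac{\sup_{\theta \in \Omega(\eta)} \prod_{i=1}^{n} f(X_i, \theta)}{\prod_{i=1}^{n} f(X_i, \theta_0)} > h^n \right\} < \epsilon$,

where $P_0$ is the probability of the relation in braces according to $f(x, \theta_0)$.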
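Proof: Proceed exactly as in the proof of Wald's Theorem 1 and obtain $r_0$ and
$\theta_1, \dots, \theta_h$, so that the set-theoretic sum of the open spheres $S(\theta_i, \rho_{\theta_i})$, $i = 1$,
2, $\dots$, $h$ covers the compact set which is the intersection of $\Omega(\eta)$ with the
sphere $|\theta| \le r_0$. Define $T(\theta_i)$, $i = 1, \dots, h + 1$, by

$2T(\theta_i) = E \log f(X, \theta_0) - E \log f(X, \theta_i, \rho_{\theta_i})$  $(i = 1, \dots, h)$,

$2T(\theta_{h+1}) = E \log f(X, \theta_0) - E \log \varphi(X, r_0)$.

If any of the right members above are infinite let $T(\theta_i)$ be one, say. Thus all
$T(\theta_i)$ are positive. Applying the weak law of large numbers we have that, for any
$i$ such that $1 \le i \le h + 1$, there exists a positive number $N_i$ such that, when $n > N_i$,

$P_0\left\{ \dfrac{\prod_{j=1}^{n} f(X_j, \theta_i, \rho_{\theta_i})}{\prod_{j=1}^{n} f(X_j, \theta_0)} > \exp\{-nT(\theta_i)\} \right\} < \dfrac{\epsilon}{h + 1}$  $(i = 1, \dots, h)$,

$P_0\left\{ \dfrac{\prod_{j=1}^{n} \varphi(X_j, r_0)}{\prod_{j=1}^{n} f(X_j, \theta_0)} > \exp\{-nT(\theta_{h+1})\} \right\} < \dfrac{\epsilon}{h + 1}$.

From this the theorem follows immediately, with

$N(\eta, \epsilon) = \max_i N_i$,

$h(\eta) = \max_i \exp\{-T(\theta_i)\}$.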

The author is obliged to Prof. Wald for his kindness in making his paper
available to the author.





A NOTE ON RANDOM WALK 
By Herbert T. David

The Johns Hopkins University Institute for Cooperative Research

A random walk is defined as a series of discrete steps along the real line, here
denoted by $I$. Each step is represented by the chance variable $X$, with sectionally
continuous density function $f(t)$. The walk begins at any point $a$ of $I$, and
continues until a step carries us outside some subregion $\Omega$ of $I$. In this note, $\Omega$ is
taken as a finite interval with upper bound $D$ and lower bound $D - y$. The
chance variables $N$ and $Z$ are, respectively, the number of steps required to end
the walk, and the endpoint of the walk. The range of $Z$ always excludes $\Omega$.
Below we define $x = D - a$, and consider $E(N)$ as a function $G(x, y)$ of $x$
and $y$. Under specified conditions, a differential equation (32) is derived, relating
$G(0, y)$ and $G(x, y)$.


(0 

f i(0 “ HI - o) 

iJJ) f ••• (n ” 1) ••• J n/fud 

(2) 

/ - « - £ dyi • • • dg„.i ; 

where 

4' Z g.] i. 2, ■ • •, n ~ 

Then 

/'lZ<ufi,iV « n) » / MOdl 


® rt| “ 0 

Hence 

i'|,V « «J f f«(() dl 

(3) 

i,f UOdl. 


n > I 


for icie 
for Witil. 


The trawfortnatioii |Ai + iCt-u £f/»t ’• I* 
the more convptiit’nl exprttwion 


n - l] gives for ^„(0 


r 

Ju 


(n — II 


fjih, ” a) 


- hn-i) dhi--- dk-i. 


(4) 







Thr* inU*gral j fjit di alw?:JiiU'!y oin^ t*rgfnl., henw may lie inte- 
grat«l first Hifh resprrt !o t. Tlsift givfs, kf-rping; Sh»‘ nutafinn of (41 

( 5 ) / * f t'tfW, 1 . 

.\W4Uffling that E(N \ mnaijw fmitr* for bH ctmfsidrrwl«jursl it, seriw (3) nmy be 

«9‘ 

ttfaiTangwi, giving : ®(.V| =« 21 ih whw 



Now, £ PIN lUS 1 1 « 1. AIiw, tming ih) mil miluHkm m n, it k readily 

l-»l 

gLho^vn that dL so lliat 

Ja 

(fl) E{N) ° 1 + £ 

<«•! 

Define transformations $T_n$: $[g_i = D - h_i,\ i = 1, \dots, n - 1;\ g_n = D - t]$.
Substituting expressions (1) and (4) in (6), transform the $n$th term of the summation
by $T_n$. This gives

(7)  $E(N) = 1 + \sum_{n=1}^{\infty} \int_0^y \cdots (n) \cdots \int_0^y f(x - g_1) \prod_{i=1}^{n-1} f(g_i - g_{i+1})\, dg_1 \cdots dg_n$,

where $x = D - a$.

By (7), $E(N)$ is a function of $x$ and $y$; hence we write $E(N) = G(x, y)$.
Define:

$M(k)$: Max $f(t)$ for $|t| \le k$.

$K$: Any number satisfying $K < [1 - \epsilon]/M(K)$.

$R$: Any region $[-\infty < x < \infty;\ 0 < y < K]$.

$M$: Max $f(t)$.

$L$: Any number satisfying $L \le [1 - \epsilon]/M$.

$R'$: Any region $[-\infty < x < \infty;\ 0 \le y < L]$.

In the ensuing argument, we shall assume that

(8)  $(x, y) \in R$.

This condition restricts certain one-dimensional and two-dimensional variables
to regions over which some infinite series are uniformly convergent with respect
to these variables. Uniform convergence is required to validate term-by-term
differentiations and integrations, and to establish the continuity in one or two
variables of certain functions represented by series.

Arguments dealing with the solution of integral equations (17), (20) and (25)
are valid only under the more restrictive condition






(9)  $(x, y) \in R'$,

this being the general sufficiency condition for the existence of solutions. However,
(17) and (20) enter the argument with respect only to the derivation of
equation (21) which could have been derived, though in a more cumbersome
manner, by a term by term comparison of the series expressions for $[\lambda_{01}(x, y)][G(y, y)]$
and $[G_{01}(x, y)][\lambda(y, y)]$, the latter approach being valid under (8).

Similarly, (25) is used only in obtaining (27), which could have been obtained
by a direct manipulation of the series expression for $G(x, y)$, this approach also
being valid under (8). Hence, all subsequent derivations hold, as long as $(x, y) \in R$.
By (8), we may interchange summation and integration with respect to $g_1$
in (7). This gives

(10)  $G(x, y) = 1 + \int_0^y f(x - g)\, G(g, y)\, dg$.
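The integral equation (10) lends itself to a simple numerical treatment. The Python sketch below is one possible approach and not the author's method: it discretizes $(0, y)$ by the midpoint rule and iterates (10) as a fixed-point map, which reproduces the partial sums of series (7). The function name `expected_steps`, the grid size `m`, and the tolerance are assumptions made for illustration; convergence of the iteration is only expected when $y \cdot \max f(t) < 1$, in the spirit of the conditions defining $K$ and $L$.

```python
def expected_steps(density, y, m=100, tol=1e-12, max_iter=10000):
    """Approximate G(x, y) = E(N) on a midpoint grid of (0, y) from equation (10).

    density  -- callable f(t), the step density (supplied by the caller)
    y        -- length of the interval Omega
    m        -- number of grid cells (hypothetical tuning parameter)
    Returns (grid, values) with values[k] approximating G(grid[k], y).
    """
    h = y / m
    grid = [(k + 0.5) * h for k in range(m)]
    # kernel[a][b] approximates f(x_a - g_b) dg on the midpoint grid
    kernel = [[density(x - g) * h for g in grid] for x in grid]
    values = [1.0] * m                     # zeroth approximation: G = 1
    for _ in range(max_iter):
        updated = [1.0 + sum(k * v for k, v in zip(row, values)) for row in kernel]
        if max(abs(a - b) for a, b in zip(updated, values)) < tol:
            return grid, updated
        values = updated
    raise RuntimeError("fixed-point iteration did not converge")


def uniform_density(t):
    """Illustrative step density: uniform on [-1, 1]."""
    return 0.5 if abs(t) <= 1.0 else 0.0


grid, values = expected_steps(uniform_density, y=0.8, m=50)
```

For the uniform step density and $y < 1$ the kernel is constant on $(0, y)$, so (10) has the constant solution $G = 1/(1 - y/2)$, which gives a convenient check on the discretization.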


(11) Assume that $f(t)$ has a continuous derivative everywhere.

Then $f(t)$ is continuous and $G(x, y)$ is continuous by (7) and (8). Hence

(12)  $f(x - g)\,G(g, y)$ and $\dfrac{\partial}{\partial x} f(x - g)\,G(g, y)$ are continuous in $(x, g)$;

(13)  $f(x - g)\,G(g, y)$ is continuous in $(g, y)$.

Let $G_{ij}(x, y)$ denote $\dfrac{\partial^{i+j}}{\partial x^i\, \partial y^j} G(x, y)$.

Then by (12), we may differentiate (10) with respect to $x$, and, since
$\dfrac{\partial}{\partial x} f(x - g) = -\dfrac{\partial}{\partial g} f(x - g)$, integration by parts yields

(14)  $G_{10}(x, y) = f(x)\,G(0, y) - f(x - y)\,G(y, y) + \int_0^y f(x - g)\, G_{10}(g, y)\, dg$.

Further under (8), $G_{01}(x, y)$ may be obtained by differentiating (7) term by
term. It is continuous in $(x, y)$. Hence $f(x - g)\,G_{01}(g, y)$ is continuous in $(g, y)$,
and we may differentiate (10) with respect to $y$, giving

(15)  $G_{01}(x, y) = f(x - y)\,G(y, y) + \int_0^y f(x - g)\, G_{01}(g, y)\, dg$.

Adding (14) to (15), dividing by $G(0, y)$ which is always greater or equal to 1,
and letting

(16)  $\lambda(x, y) = [G_{10}(x, y) + G_{01}(x, y)] / G(0, y)$

we obtain

(17)  $\lambda(x, y) = f(x) + \int_0^y f(x - g)\, \lambda(g, y)\, dg$.




Under (9), (17) defines a function

(18)  $\lambda(x, y) = f(x) + \sum_{n=1}^{\infty} \int_0^y \cdots (n) \cdots \int_0^y f(x - g_1) \prod_{i=1}^{n-1} f(g_i - g_{i+1})\, f(g_n)\, dg_1 \cdots dg_n$.

By (8), this function is continuous in $(x, y)$ and may be differentiated term by
term with respect to $y$. Further, $\lambda_{01}(x, y)$ thus gotten is continuous in $(x, y)$, so
that $f(x - g)\,\lambda_{01}(g, y)$ is continuous in $(g, y)$. Hence, (17) may be differentiated
with respect to $y$, giving

(19)  $\lambda_{01}(x, y) = f(x - y)\,\lambda(y, y) + \int_0^y f(x - g)\, \lambda_{01}(g, y)\, dg$.

Since, under (9), the integral equation

(20)  $\alpha(x, y) = f(x - y) + \int_0^y f(x - g)\, \alpha(g, y)\, dg$

has a unique continuous solution for every fixed $y$, (15) and (19) give


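(21)  $\dfrac{\lambda_{01}(x, y)}{\lambda(y, y)} = \dfrac{G_{01}(x, y)}{G(y, y)}$.

Hence

$\dfrac{\int_0^y \lambda_{01}(x, y)\, dx}{\lambda(y, y)} = \dfrac{\int_0^y G_{01}(x, y)\, dx}{G(y, y)}$

and

(22)  $\dfrac{\dfrac{d}{dy} \int_0^y \lambda(x, y)\, dx}{\lambda(y, y)} = \dfrac{\dfrac{d}{dy} \int_0^y G(x, y)\, dx}{G(y, y)}$.

(23)  Let $f(t) = f(-t)$.

Then it is obvious from the definition that

(24)  $G(0, y) = G(y, y)$.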

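Further, by (15),

(25)  $\dfrac{G_{01}(x, y)}{G(y, y)} = f(x - y) + \int_0^y f(x - g)\, \dfrac{G_{01}(g, y)}{G(y, y)}\, dg$

so that, under (9), (25) gives for $G_{01}(x, y)/G(y, y)$ the unique expression

$f(x - y) + \sum_{n=1}^{\infty} \int_0^y \cdots (n) \cdots \int_0^y f(x - g_1) \prod_{i=1}^{n-1} f(g_i - g_{i+1})\, f(g_n - y)\, dg_1 \cdots dg_n$

which, by (23), is equal to

$f(y - x) + \sum_{n=1}^{\infty} \int_0^y \cdots (n) \cdots \int_0^y f(y - g_n) \prod_{i=1}^{n-1} f(g_{i+1} - g_i)\, f(g_1 - x)\, dg_1 \cdots dg_n$.

Integrating with respect to $x$, it follows that

(26)  $\int_0^y \dfrac{G_{01}(x, y)}{G(y, y)}\, dx = \int_0^y f(y - x)\, dx + \sum_{n=1}^{\infty} \int_0^y \cdots (n + 1) \cdots \int_0^y f(y - g_n) \prod_{i=1}^{n-1} f(g_{i+1} - g_i)\, f(g_1 - x)\, dg_1 \cdots dg_n\, dx$

which, by a change of integration indices and a referral to (7), is seen to equal
$[G(y, y) - 1]$. (26) thus gives

(27)  $\int_0^y G_{01}(x, y)\, dx = G(y, y)\,[G(y, y) - 1]$.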
JQ 


Further, by (16), (24), and (27),

(28)  $\int_0^y \lambda(x, y)\, dx = G(0, y) - 1$

so that

(29)  $\dfrac{d}{dy} \int_0^y \lambda(x, y)\, dx = \dfrac{d}{dy} G(0, y)$

while (24) and (27) also yield

(30)  $\dfrac{d}{dy} \int_0^y G(x, y)\, dx = [G(0, y)]^2$.

Hence, by (22), (29), and (30),

(31)  $\lambda(y, y) = \dfrac{d}{dy} G(0, y) \Big/ G(0, y)$.

Finally, substituting (31) in (21), and remembering the definition of $\lambda$ given in
(16), we get, using (24),

(32)  $G(0, y)\,[G_{11}(x, y) + G_{02}(x, y)] = \dfrac{d}{dy} G(0, y)\,[G_{10}(x, y) + 2G_{01}(x, y)]$.

The conditions under which (32) holds are, in summary, (8), (11), and (23).


If /(O has an expansion 

(33) 

it is clear from (7) that 

(34) 


lil<r 


G{x,y) = Z B,ix'y’ 
1 , 1—0 


for {X, 3 /) where S-.[f, < x < Tr^O < y < Tr+U To <0,T,< T. 




Substituting (34) in (32), and equating coefficients of like powers of $(x, y)$,
we obtain the recursion formulae

t3S) T /f.,/kWf2l;-J”4 ll- Z + Illi - ii; i:0, 1,.... 

From (10), it is readily verified that $B_{i0} = 0$ for $i \ne 0$, so that equations (35)
give solutions for the $B_{ij}$ in terms of the $B_{0j}$. The solutions are of interest
since they show a one-to-one correspondence between the functions $G(0, y)$
and $G(x, y)$, for $(x, y) \in S$.


NUMERICAL INTEGRATION FOR LINEAR SUMS OF EXPONENTIAL FUNCTIONS

By Robert E. Greenwood

The University of Texas and the Institute for Numerical Analysis¹

1. Introduction. The methods of numerical integration going by the names
trapezoidal rule, Simpson's rule, Weddle's rule, and the Newton-Cotes formulae
are of the type

(1) $\int_{-1}^{1} f(x)\,dx \cong \sum_{i=0}^{n} \lambda_{in}\, f(x_{in}),$

where the abscissae $\{x_{in}\}$ are uniformly distributed on a finite interval, chosen
as (-1, 1) for convenience,

(2) $x_{in} = -1 + \frac{2i}{n}, \qquad i = 0, 1, 2, \cdots, n,$

and where the set of constants $\{\lambda_{in}\}$ depends on the name of the rule and the value
of n but not on the function f(x). Throughout this note all abscissae will be
assumed to be uniformly distributed on (-1, 1) unless the contrary is explicitly
stated.

Since correspondence relation (1) involves (n + 1) constants $\{\lambda_{in}\}$, it might
be possible to choose (n + 1) arbitrary functions $g_j(x)$, j = 0, 1, 2, ..., n,
and require that the set $\{\lambda_{in}\}$ be the solution, if such exists, of the (n + 1)
simultaneous linear equations

(3) $\int_{-1}^{1} g_j(x)\,dx = \sum_{i=0}^{n} \lambda_{in}\, g_j(x_{in}), \qquad j = 0, 1, 2, \cdots, n.$

Indeed, the selection

(4) $g_j(x) = x^j, \qquad j = 0, 1, 2, \cdots, n,$

will give a set of (n + 1) simultaneous equations of form (3) and the solution $\{\lambda_{in}\}$
is the set of Newton-Cotes weights for that value of n. The numerical evaluation
of $\{\lambda_{in}\}$ is best accomplished by other and more sophisticated methods, however.

¹ This work was performed with the financial support of the Office of Naval Research of
the Navy Department.

Because of linearity in both the integral and the finite summation, once the
constants $\{\lambda_{in}\}$ have been determined for a specific set of functions $\{g_j(x)\}$,
correspondence relation (1) is exact for any linear combination of that funda-
mental set. Thus, for example, for the fundamental set (4), correspondence
relation (1) with the appropriate values $\{\lambda_{in}\}$ is exact for all polynomials of
degree less than or equal to n.

Although tradition favors the set of functions (4), there is nothing compelling
about such a selection. Indeed, two other possible choices might be

(5) $g_j(x) = e^{jx}, \qquad j = 0, 1, 2, \cdots, n,$

and

(6) $g_j(x) = e^{jx}, \qquad j = -m, -m + 1, \cdots, 0, 1, \cdots, m - 1, m; \quad n = 2m.$

These choices would seem to be appropriate whenever numerical methods are
being applied to exponential growth curves or exponential decay curves.


2. Use of the basic set $g_j(x) = e^{jx}$. If integration relation (1) be made exact
for the set $\{e^{jx}\}$, j = 0, 1, ..., n, with evenly spaced x abscissae, the set (3) of
(n + 1) simultaneous linear equations in the unknowns $\{\lambda_{in}\}$, i = 0, 1, ..., n,
is obtained. Call the solution of this system $\{a_{in}\}$; solution values for n = 1, 2,
3, 4, 5, 6 are tabulated below.

For the symmetric case where integration relation (1) is made exact for
$\{e^{jx}\}$, j = -m, -m + 1, ..., m - 1, m; n = 2m, a similar but different set of
linear equations (3) results for the unknowns $\{\lambda_{in}\}$. Call the solution of this
system $\{b_{in}\}$. As implied above, only even values of n are used in order to preserve
the symmetry, and values of $\{b_{in}\}$ are tabulated below for n = 2, 4, 6.


n = 1    a_01 =  1.31303 5285
         a_11 =  0.68696 4715

n = 2    a_02 =  0.21805 032⁺      b_02 = 0.32260 623⁻
         a_12 =  1.49780 742       b_12 = 1.35478 755
         a_22 =  0.28414 226⁻      b_22 = 0.32260 623⁻

n = 3    a_03 =  0.51324 284
         a_13 =  0.22445 055
         a_23 =  1.08155 527
         a_33 =  0.18075 134

n = 4    a_04 = -0.13716 639⁺      b_04 = 0.15048 171
         a_14 =  1.40098 548       b_14 = 0.73243 318
         a_24 = -0.30895 914       b_24 = 0.23417 022
         a_34 =  0.91710 903       b_34 = 0.73243 318
         a_44 =  0.12803 103⁻      b_44 = 0.15048 171

n = 5    a_05 =  0.68919 3
         a_15 = -1.07644 3
         a_25 =  2.12534 6
         a_35 = -0.63595 6
         a_45 =  0.79933 8
         a_55 =  0.09852 18

n = 6    a_06 = -0.83607           b_06 = 0.09443 5
         a_16 =  3.54128           b_16 = 0.53464 7
         a_26 = -3.88102           b_26 = 0.01139 3
         a_36 =  3.32254           b_36 = 0.71905 0
         a_46 = -0.94685           b_46 = 0.01139 3
         a_56 =  0.72075           b_56 = 0.53464 7
         a_66 =  0.07937 5⁻        b_66 = 0.09443 5

² Whittaker and Robinson, The Calculus of Observations, 4th Edition (1946), London,
pp. 162-166.

The computing service of the Institute for Numerical Analysis has supplied the author
with most of the coefficients tabulated above.
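The tabulated coefficients can be re-derived numerically. What follows is a minimal sketch, assuming NumPy is available; the function name `exponential_rule_weights` and its arguments are illustrative choices, not taken from the paper. It builds the linear system (3) for the basis $e^{jx}$ at the evenly spaced abscissae (2) and solves it directly, which should be adequate for the small values of n tabulated here.

```python
import numpy as np


def exponential_rule_weights(n, symmetric=False):
    """Solve system (3) for the basis g_j(x) = exp(j*x) on (-1, 1).

    symmetric=False uses j = 0..n (the a-coefficients);
    symmetric=True uses j = -m..m with n = 2m (the b-coefficients).
    """
    x = -1.0 + 2.0 * np.arange(n + 1) / n           # abscissae (2)
    if symmetric:
        if n % 2:
            raise ValueError("the symmetric rule needs even n = 2m")
        js = np.arange(-(n // 2), n // 2 + 1)
    else:
        js = np.arange(n + 1)
    # Row j of the system: sum_i lambda_i * exp(j * x_i) = integral of exp(j*x).
    matrix = np.exp(np.outer(js, x))
    # The integral of exp(j*x) over (-1, 1) is 2 for j = 0 and 2*sinh(j)/j otherwise.
    rhs = np.array([2.0 if j == 0 else 2.0 * np.sinh(j) / j for j in js])
    return x, np.linalg.solve(matrix, rhs)


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        _, a = exponential_rule_weights(n)
        print("a, n =", n, np.round(a, 8))
    for n in (2, 4, 6):
        _, b = exponential_rule_weights(n, symmetric=True)
        print("b, n =", n, np.round(b, 8))
```

For n = 1 the system can be solved by hand and gives $a_{01} = \coth 1$ and $a_{11} = 2 - \coth 1$, which is a convenient check on the first table entries.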

3. Estimates of the error term. The choices of the coefficients $\{a_{in}\}$ and
$\{b_{in}\}$ are such that integration relation (1) is exact whenever

(7) $f(x) = A_0 + A_1 e^{x} + \cdots + A_n e^{nx}$ and $\lambda_{in} = a_{in},$

and whenever

(8) $f(x) = B_{-m} e^{-mx} + \cdots + B_0 + \cdots + B_m e^{mx}$ and $\lambda_{in} = b_{in}.$

When f(x) is not of these prescribed forms, the error in using correspondence (1)
may be of some importance. By making the transformation

(9) $u = e^{x}, \qquad f(x) = f(\log u) = g(u),$

integration relation (1) becomes

(10) $\int_{1/e}^{e} g(u)\,\frac{du}{u} \cong \sum_{i=0}^{n} \lambda_{in}\, g(u_{in}),$

where the $\{u_{in}\}$ are not evenly distributed. By approximating g(u) by its Taylor's
series with a remainder term, the following expressions for the error in using
correspondence (1) can be obtained:

Using the coefficients $\{a_{in}\}$,

(11) Erior < [2 + 1:1 a,. |] [_i^_ (.- £)*' 

and, using the coefficients $\{b_{in}\}$,


( 12 ) 


Error 


iy"^' 


(2m 4- 11 


/ r r - e-" ^|b.,,„n 




Neither of these error expressions can be said to be very practical in actual
computation, and neither appears suitable for establishing convergence properties
of the type

(13) $\lim_{n \to \infty} \sum_{i=0}^{n} \lambda_{in}\, f(x_{in}) = \int_{-1}^{1} f(x)\,dx.$

However, both (11) and (12) reduce to zero when f(x) is of the form prescribed
by (7) or (8) respectively.

4. Numerical examples. As illustrative numerical examples, the case n = 4
was selected and several typical functions were integrated approximately by the
positive power exponential rule, the symmetrical exponential rule and the
Newton-Cotes formula,

$\int_{-1}^{1} f(x)\,dx \cong \tfrac{1}{45}\left[7f(-1) + 32f(-\tfrac12) + 12f(0) + 32f(\tfrac12) + 7f(1)\right].$

Values of $\{a_{i4}\}$ and $\{b_{i4}\}$ are given in the tables in part 2. The typical functions
used were $x^2$, $e^{2x}$, $1/(x + 3)$, $e^{-x^2}$, $x e^{x}$, $x^6$, and $e^{2.2x}$. The following results were
obtained:


Function      Positive Power    Symmetrical      Newton-Cotes     8 Decimal
              Exponential       Exponential                       Approximation to
                                                                  Exact Value

x^2            .5703 8827        .6671 8001       .6666 6666       .6666 6667
e^{2x}        3.6268 6044       3.6268 6041      3.6317 3108      3.6268 6041
1/(x + 3)      .6828 6353        .6931 5792       .6931 7460       .6931 4718
e^{-x^2}      1.4930 1396       1.4857 2754      1.4887 4582      1.4936 4827
x e^x          .7296 4338        .7353 6007       .7361 7480       .7357 5888
x^6            .0270 8487        .3238 5196       .3333 3332       .2857 1429
e^{2.2x}      4.0527 7287       4.0530 7585      4.0607 7415      4.0519 1379

From this tabulation, it would appear that the symmetrical exponential
method compares favorably with the Newton-Cotes method for such
functions as $1/(x + 3)$, $e^{-x^2}$, $x e^{x}$, $x^6$, and $e^{2.2x}$. The use of $x^2$ or $e^{2x}$
is not really a fair choice when comparing the symmetrical exponential
and Newton-Cotes methods, since Newton-Cotes is derived so as to give
exactness for $x^2$ and the symmetrical exponential so as to give exactness for $e^{2x}$.
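The comparison for n = 4 can be repeated with a short script. This is only a sketch under stated assumptions: NumPy is assumed, the helper name `exponential_weights` is hypothetical, and the weights are obtained by solving system (3) directly rather than read from the table, so the last digits may differ slightly from the printed values.

```python
import math

import numpy as np


def exponential_weights(exponents, x):
    """Weights w_i with sum_i w_i * exp(j * x_i) exact for each j in exponents."""
    js = list(exponents)
    matrix = np.exp(np.outer(js, x))
    rhs = np.array([2.0 if j == 0 else 2.0 * math.sinh(j) / j for j in js])
    return np.linalg.solve(matrix, rhs)


x = np.linspace(-1.0, 1.0, 5)                      # abscissae (2) with n = 4
positive = exponential_weights(range(0, 5), x)     # positive power exponential rule
symmetric = exponential_weights(range(-2, 3), x)   # symmetrical exponential rule
newton_cotes = np.array([7.0, 32.0, 12.0, 32.0, 7.0]) / 45.0

# Three of the test functions from the text, paired with their exact integrals.
cases = {
    "1/(x+3)": (lambda t: 1.0 / (t + 3.0), math.log(2.0)),
    "x*exp(x)": (lambda t: t * np.exp(t), 2.0 / math.e),
    "x**6": (lambda t: t ** 6, 2.0 / 7.0),
}

for name, (func, exact) in cases.items():
    values = func(x)
    print(name, positive @ values, symmetric @ values, newton_cotes @ values, exact)
```

Each printed row lists the three approximations followed by the exact integral, in the same column order as the table above.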





SMOOTHEST APPROXIMATION FORMULAS

By Arthur Sard¹

Queens College

Introduction. Consider a process of approximation which operates on a
function x = x(t). The error in the process may be thought of as a sum R + δA,
where R is the error that would be present if x were exact and δA is the error due
to errors in x. (Precise definitions are given below.) Suppose that one wishes to
choose one process A from a class $\mathfrak{A}$ of processes. In some situations it is appro-
priate to base the choice on R alone²; in others it is appropriate to consider δA.

The primary purpose of the present note is to formulate a criterion of smoothest
approximation: That A in $\mathfrak{A}$ is smoothest which minimizes the variance of
δA. A criterion based on both R and δA is also suggested. (Sections 1 and 2.)
Smoothest approximate integration formulas of one type are derived in Section 3.

Progress in the technique of estimating the covariance function of the errors
in x will lead to further applications of the criterion of smoothest approximation.

1. Approximation of a functional. Suppose that X is a space of functions
x = x(t) each of which is continuous on a ≤ t ≤ b. Let f[x] be a functional
defined on X; that is, f[x] is a real number defined for each x ∈ X. For example,
X might be the space of functions with second derivatives on [a, b] and f[x] might
be x''(w), where w is a fixed number in [a, b].

Suppose that f[x] is to be approximated by a Stieltjes integral

(1) $A = \int_a^b x(t)\,d\alpha(t), \qquad x \in X,$

where α is a function of bounded variation. The remainder in the approximation
of f[x] by A is

$R = A - f[x].$

If the approximation (1) operates on x + δx instead of x, the result is
$A + \delta A = \int (x + \delta x)\,d\alpha$; and the error in the approximation of f[x] by A + δA is R + δA,
where

(2) $\delta A = \int_a^b \delta x(t)\,d\alpha(t).$

Consider a class $\mathfrak{A}$ of approximations A, each of the form (1). We shall propose
a criterion for characterizing the "smoothest A" in $\mathfrak{A}$, relative to the covariance
function of the errors δx.


¹ The author gratefully acknowledges financial support received from the Office of Naval
Research.

² "Best approximate integration formulas; best approximation formulas," Amer. Jour.
of Math., Vol. 71 (1949), pp. 80-91.







Assume that δx = δx(t) is a stochastic process with mean zero³ and covariance
function σ(t, u). Then, by (2), δA is a stochastic variable; and⁴

(3) $E\,\delta A = E \int \delta x\,d\alpha = \int 0\,d\alpha = 0,$

$V = E(\delta A)^2 = E\left[\int \delta x(t)\,d\alpha(t) \int \delta x(u)\,d\alpha(u)\right] = \iint \sigma(t, u)\,d\alpha(t)\,d\alpha(u).$

Criterion. That A (if any) in $\mathfrak{A}$ is smoothest which minimizes the variance
V of δA.

In particular cases, this criterion (least squares) has been proposed and used
by Chebyshev and others. An application to approximate integration is given in
Section 3 below.

One may extend this discussion to cases in which the approximations A
involve derivatives of x.

Remark. The criterion of best approximation² may be combined with the
above criterion of smoothest approximation as follows: That A (if any) in $\mathfrak{A}$ is
the best compromise which minimizes a specified combination of the variance
of δA and the modulus of R. Here it is assumed that the remainders R satisfy the
conditions for the existence of the modulus.²


2. Approximation of a function. One may extend the preceding discussion
to the case in which y = f[x] is an operation to a space of functions y = y(u),
$\bar a \le u \le \bar b$; and in which the approximation of f[x] is

$A = \int_a^b x(t)\,d_t\alpha(t, u), \qquad x \in X,$

where, for each u, α is a function of bounded variation in t. Then, for each u,
δA has a variance V(u). Criterion. That A (if any) in a class of approximations is
smoothest which minimizes V(u) for all u; failing such an A, that A (if any) is
smoothest which minimizes the integral of V(u), or alternatively, the supremum
of V(u), over $\bar a \le u \le \bar b$.


3. Smoothest approximate integration formulas in a particular case.⁵ Let m
and n be fixed integers; m ≥ 1, n ≥ 0. Let $\mathfrak{A} = \mathfrak{A}_{m,n}$ be the class of approximations of

³ The essential point here is that $E\,\delta x(t) = m(t)$ be known for each t; for, given m(t), one
could and would replace x + δx by x + δx - m.

⁴ We assume here that the integrals in (3) exist and that the inversions of E and $\int d\alpha$
are valid. For this it is sufficient that δx be integrable relative to the product measure dα dω
for all functions δx corresponding to elements ω of Ω, where Ω is the underlying
probability space relative to which E is the operator $\int d\omega$. Cf. J. L. Doob, "Probability in
function space," Bull. Amer. Math. Soc., Vol. 53 (1947).

⁵ The approximate integration formulas of this section are of such a nature that one
would expect them to be known. The values of J at the end are probably new.



$\int_{-m/2}^{m/2} x(t)\,dt = f[x]$

of the form

$A = \sum_i b_i\, x(i),$

the m + 1 constants $b_i$ being such that A = f[x] whenever x is a polynomial of
degree n. Throughout this section i is to range over the m + 1 values i = -m/2,
-m/2 + 1, ..., +m/2. Suppose that the errors δx(i) are uncorrelated, with
common variance σ² and with mean zero. Then α(t) is a step function with jumps
$b_i$ at t = i; and

$V = \sigma^2 \sum_i b_i^2.$


The smoothest approximation in $\mathfrak{A}_{m,n}$ is the one for which $\sum_i b_i^2$ is a minimum.
(The m + 1 variables $b_i$ in V are subject to n + 1 constraints due to the condition
that the approximation be exact for degree n. The set $\mathfrak{A}_{m,n}$ is empty if and only
if m is less than the largest even integer contained in n.)

If n = 0 or 1, the smoothest formula in $\mathfrak{A}_{m,n}$ is the one for which all the
coefficients are equal:

$b_i = m/(m + 1);$

in which case

$V = \sigma^2 m^2/(m + 1).$


If n = 2 or 3, the smoothest formula in $\mathfrak{A}_{m,n}$ is characterized by the following
relations:

$b_i = \lambda_0 + i^2 \lambda_1,$

$\lambda_0 = \frac{m(2m^2 + 9m - 6)}{2(m - 1)(m + 1)(m + 3)},$

$\lambda_1 = \frac{-30m}{(m - 1)(m + 1)(m + 2)(m + 3)};$

in which case

$V/\sigma^2 = \lambda_0 m + \lambda_1 m^3/12.$

Thus, the smoothest approximation in $\mathfrak{A}_{6,2}$ or in $\mathfrak{A}_{6,3}$ is the following:

$A = \tfrac{1}{2}[x(-3) + x(3)] + \tfrac{6}{7}[x(-2) + x(2)] + \tfrac{15}{14}[x(-1) + x(1)] + \tfrac{8}{7}\,x(0).$
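The weights above can be checked with exact rational arithmetic. The following is a minimal sketch, not the author's procedure: the function names `solve_exact` and `smoothest_weights` are hypothetical, and the code assumes (as the symmetry of the problem suggests) that the minimizing weights are polynomials in $i^2$ of degree [n/2], whose coefficients are fixed by the exactness conditions for even powers.

```python
from fractions import Fraction


def solve_exact(matrix, rhs):
    """Gauss-Jordan elimination over Fractions; assumes a nonsingular system."""
    size = len(rhs)
    rows = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        lead = rows[col][col]
        rows[col] = [entry / lead for entry in rows[col]]
        for r in range(size):
            if r != col:
                factor = rows[r][col]
                rows[r] = [e - factor * p for e, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


def smoothest_weights(m, n):
    """Return (weights b_i, V / sigma^2) for the smoothest rule exact to degree n."""
    if m < 2 * (n // 2):
        raise ValueError("the class is empty for these m and n")
    nodes = [Fraction(2 * k - m, 2) for k in range(m + 1)]   # i = -m/2, ..., m/2
    count = n // 2 + 1                                        # number of multipliers
    # Integral of t^(2*mu) over (-m/2, m/2).
    c = [Fraction(m ** (2 * mu + 1), 4 ** mu * (2 * mu + 1)) for mu in range(count)]
    # Exactness for t^(2*mu): sum_nu lambda_nu * sum_i i^(2*mu + 2*nu) = c_mu.
    gram = [[sum(i ** (2 * (mu + nu)) for i in nodes) for nu in range(count)]
            for mu in range(count)]
    lam = solve_exact(gram, c)
    weights = [sum(l * i ** (2 * nu) for nu, l in enumerate(lam)) for i in nodes]
    ratio = sum(l * value for l, value in zip(lam, c))
    return weights, ratio


if __name__ == "__main__":
    weights, ratio = smoothest_weights(6, 2)
    print([str(w) for w in weights], ratio)
```

With m = 6 and n = 2 the weights come out as 1/2, 6/7, 15/14, 8/7, 15/14, 6/7, 1/2, matching the formula displayed above.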

By the method of Lagrange's multipliers, one may establish the following
relations for the smoothest formula in $\mathfrak{A}_{m,n}$. Here i has the same range of values
as before; μ and ν range over 0, 1, ..., [n/2].

$b_i = \sum_\nu \lambda_\nu\, i^{2\nu},$

$V/\sigma^2 = \sum_\nu \lambda_\nu c_\nu,$

where

$c_\nu = \frac{m^{2\nu+1}}{4^\nu (2\nu + 1)},$

and the $\lambda_\nu$ are determined by the equations

$\sum_\nu \lambda_\nu \sum_i i^{2\mu + 2\nu} = c_\mu.$

The class $\mathfrak{A}_{m,n}$ is such that for each $A \in \mathfrak{A}_{m,n}$ there is a function k with the
following property:

$R = A - f[x] = \int_{-m/2}^{m/2} x^{(n+1)}(t)\,k(t)\,dt$

whenever x is a function with continuous (n + 1)th derivative. The quantity

$J = \int_{-m/2}^{m/2} k^2(t)\,dt$

is useful in appraising R, since

$R^2 \le J \int_{-m/2}^{m/2} \left[x^{(n+1)}(t)\right]^2 dt$

by Schwarz's inequality.

Values of J for the smoothest formulas are as follows.

n = 0;  $J = m^2/6(m + 1).$

n = 1;  $J = m^2(3m^2 + 2m + 1)/360(m + 1).$

For n = 2 and 3, and m ≤ 6, the numerical values of J are as follows.

m      J (n = 2)           J (n = 3)
2      1/1,890             1/9,072
3      11/8,960            13/17,920
4      134/33,075          62,539/13,891,500
5      1,865/150,528       136,223/6,322,176
6      8/245               6,683/82,320

For the method of calculation of J, as well as the transformation of J under a
linear transformation of t, the reader may consult the paper².



ON THE POWER FUNCTION OF THE "BEST" t-TEST SOLUTION OF THE
BEHRENS-FISHER PROBLEM

By John E. Walsh

The Rand Corporation

1. Introduction. The Behrens-Fisher problem is concerned with significance
tests for the difference of the means of two normal populations when the ratio
of the variances of the populations is unknown. Denote one population by
$N(a_1, \sigma_1^2)$ and the other by $N(a_2, \sigma_2^2)$, where the notation $N(a, \sigma^2)$ represents a
normal population with mean a and variance σ². Let m sample values be drawn
from $N(a_1, \sigma_1^2)$ and n sample values from $N(a_2, \sigma_2^2)$, where m ≤ n. Then Scheffé
[1] has shown that certain optimum properties are possessed by a t-test solution
he proposed for the Behrens-Fisher problem, in which the numerator of t is based
on the difference of the means of the samples while the denominator is based on
the square root of a function of the sample values which has a χ²-distribution
with m - 1 degrees of freedom. The purpose of this note is to compare the power
function of this t-test with the power function of the corresponding most powerful
test for the case in which the ratio of variances $\sigma_1^2/\sigma_2^2$ is also known (only one-
sided and symmetrical tests are considered). This comparison is made by com-
puting the power efficiency (see section 2 for definition) of Scheffé's test.

It is sufficient to limit power efficiency investigations to one-sided tests. As
shown in [2], a symmetrical t-test with significance level 2α has the same power
efficiency as the corresponding one-sided t-test with significance level α. Equation
(2) of section 2 furnishes an explicit formula whereby approximate power effi-
ciencies can be computed for a wide range of values of α, m, n. Table 1 contains
values of (2) for α = .05, .01 and several values of m and n.

For the situation considered here, a power efficiency of 100r% has the quantita-
tive interpretation that the given test based on samples of size m and n has
approximately the same power function as the corresponding most powerful
test based on samples of size rm and rn. Intuitively the power efficiency
of a test measures the percentage of available information per observation
which is utilized by that test.

2. Power efficiency derivations. The basic notion of the power efficiency of
a significance test is given in [2]. For the present case the problem is to determine
the value r such that a most powerful test of the same hypothesis (same sig-
nificance level) based on rm and rn sample values will have approximately the
same power function as the given t-test based on m and n sample values (from
$N(a_1, \sigma_1^2)$ and $N(a_2, \sigma_2^2)$ respectively). Here the value of $\sigma_1^2/\sigma_2^2$ is assumed to be
known. Then the power efficiency of the given t-test equals 100r%.

If the ratio of variances $\sigma_1^2/\sigma_2^2$ is known, the most powerful significance test
(one-sided and symmetrical) for the difference of means of the two normal
populations is a t-test where the numerator of t is based on the difference of the
sample means and the denominator on the square root of a function
of the sample values and $\sigma_1^2/\sigma_2^2$ which has a χ²-distribution with m + n - 2 degrees
of freedom [1, p. 13]. Thus the problem is that of comparing the power
functions of two t-tests.

[TABLE 1. Percentage Power Efficiencies for Certain Values of m and n; α = .05]

As in section 1, it is sufficient to consider one-sided tests. We find, using
a modification of the normal approximation to the power function of a one-sided
t-test given in [3], that Scheffé's one-sided t-test for the Behrens-Fisher problem
and the corresponding most powerful one-sided test ($\sigma_1^2/\sigma_2^2$ known) have approxi-
mately the same power function when r is chosen so that

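$K_\alpha - \delta\sqrt{r}\left[1 - \frac{K_\alpha^2}{2\{(m + n)r - 2\}}\right]^{1/2} = K_\alpha - \delta\left[1 - \frac{K_\alpha^2}{2(m - 1)}\right]^{1/2},$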

where α is the significance level of the tests, $K_\alpha$ is the value of the standardized
normal deviate exceeded with probability α, and δ is a function of m, n,
$a_1$, $a_2$, $\sigma_1^2$, $\sigma_2^2$ and the given hypothetical value of $a_1 - a_2$ being tested. This
condition for the approximate equality of the power functions is reasonably
accurate for the following cases: α = .05, m > 4; α = .025, m > 5; α = .01,
m > 6; α = .005, m > 7. The accuracy of the approximation increases as m
increases.

Hence a value of r such that the two power functions are approximately equal
is determined by the equation

(1) $r\left[1 - \frac{K_\alpha^2}{2\{(m + n)r - 2\}}\right] = 1 - \frac{K_\alpha^2}{2(m - 1)}.$

Let

$A = A(m, \alpha) = 1 - \frac{K_\alpha^2}{2(m - 1)}.$

Then solving (1) for the appropriate root yields

$r = \frac{1}{2(m + n)}\left\{2 + (m + n)A + K_\alpha^2/2 + \sqrt{\{2 + (m + n)A + K_\alpha^2/2\}^2 - 8(m + n)A}\right\}.$

Thus the power efficiency of Scheffé's one-sided solution to the Behrens-
Fisher problem, for the case in which the ratio of the variances is also known, is
approximately equal to

(2) $\frac{50}{m + n}\left\{2 + (m + n)A + K_\alpha^2/2 + \sqrt{\{2 + (m + n)A + K_\alpha^2/2\}^2 - 8(m + n)A}\right\}\%$

for suitable values of α and m.
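Formula (2) is straightforward to evaluate. The sketch below is illustrative only: the function name `power_efficiency` is hypothetical, and $K_\alpha$ is taken to be the upper α point of the standard normal distribution, computed with `statistics.NormalDist` from the Python standard library.

```python
import math
from statistics import NormalDist


def power_efficiency(m, n, alpha):
    """Approximate percentage power efficiency from formula (2)."""
    k = NormalDist().inv_cdf(1.0 - alpha)        # K_alpha, exceeded with probability alpha
    a = 1.0 - k * k / (2.0 * (m - 1))            # A(m, alpha)
    s = 2.0 + (m + n) * a + k * k / 2.0
    # Larger root of 2(m+n) r^2 - 2 s r + 4 A = 0, which is equation (1) cleared of fractions.
    r = (s + math.sqrt(s * s - 8.0 * (m + n) * a)) / (2.0 * (m + n))
    return 100.0 * r


if __name__ == "__main__":
    for m, n in [(5, 5), (10, 10), (10, 20), (40, 40)]:
        print(m, n, round(power_efficiency(m, n, 0.05), 2))
```

The larger root is the one that tends to r = 1 as $K_\alpha$ tends to zero, which is why it is taken as the appropriate root here.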

REFERENCES

[1] Henry Scheffé, "On solutions of the Behrens-Fisher problem based on the t-distribu-
tion," Annals of Math. Stat., Vol. 14 (1943), pp. 35-44.

[2] John E. Walsh, "Some significance tests for the median which are valid under very
general conditions," Annals of Math. Stat., Vol. 20 (1949), pp. 64-81.

[3] N. L. Johnson and B. L. Welch, "Applications of the non-central t-distribution,"
Biometrika, Vol. 31 (1940), p. 376.




A NOTE ON FISHER'S INEQUALITY FOR BALANCED INCOMPLETE
BLOCK DESIGNS

By R. C. Bose

Institute of Statistics, University of North Carolina

1. An experimental design in which v varieties or treatments are arranged in
b blocks is called a balanced incomplete block design if

(i) Each block has exactly k treatments (k < v), no treatment occurring twice
in the same block.

(ii) Each treatment occurs in exactly r blocks.

(iii) Any two treatments occur together in exactly λ blocks.

It is easy to see that the parameters v, b, r, k, λ of the design satisfy the relations

(1.0) $bk = vr,$

(1.1) $\lambda(v - 1) = r(k - 1).$

Also it is readily seen that

(1.2) $r > \lambda,$

for otherwise with any given treatment every other treatment would occur in
every block. This would make k = v, and the design would become a 'randomised
block design'.

Fisher (1940) showed that a necessary condition for the existence of a balanced
incomplete block design with v treatments and b blocks is

(1.3) $b \ge v.$

It is the object of this note to give a very simple proof of Fisher's inequality.
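2. Consider a balanced incomplete block design with parameters

(2.0) v, b, r, k, λ

and let

(2.1) $n_{ij} = 1$ or $0$

according as the ith treatment does or does not occur in the jth block. Clearly

(2.2) $\sum_{j=1}^{b} n_{ij}^2 = r,$

(2.3) $\sum_{j=1}^{b} n_{ij}\, n_{i'j} = \lambda \qquad (i \ne i').$

Suppose, if possible, that b < v, and let N be the v × v matrix obtained from the
matrix $(n_{ij})$ by adjoining v - b columns of zeros. Then, from (2.2) and (2.3),

$NN' = \begin{pmatrix} r & \lambda & \cdots & \lambda \\ \lambda & r & \cdots & \lambda \\ \vdots & \vdots & \ddots & \vdots \\ \lambda & \lambda & \cdots & r \end{pmatrix},$

where N' denotes the transpose of N. Hence

(2.6) $\det(NN') = \{r + \lambda(v - 1)\}(r - \lambda)^{v-1} = kr(r - \lambda)^{v-1},$ from (1.1).

But, since N has at least one column of zeros,

(2.7) $\det(NN') = \det N \det N' = 0.$

This makes r = λ, and contradicts (1.2). Hence the assumption b < v is wrong,
and we must have

(2.8) $b \ge v.$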
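The determinant identity used in the proof can be checked on a small design. The sketch below is an illustration under stated assumptions: NumPy is assumed, and the design chosen is the familiar seven-point plane with v = b = 7, r = k = 3, λ = 1, which does not appear in the note itself. Since b = v here, no zero columns are adjoined and the determinant is nonzero, as the inequality requires.

```python
import numpy as np

# Blocks of a balanced incomplete block design with v = b = 7, r = k = 3, lambda = 1.
blocks = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5)]
v, b, r, k, lam = 7, 7, 3, 3, 1

# Incidence matrix: entry (i, j) is 1 when treatment i occurs in block j.
incidence = np.zeros((v, b), dtype=int)
for j, block in enumerate(blocks):
    for i in block:
        incidence[i, j] = 1

gram = incidence @ incidence.T
# By (2.2) and (2.3): r on the diagonal, lambda elsewhere.
expected = (r - lam) * np.eye(v, dtype=int) + lam * np.ones((v, v), dtype=int)

det_formula = (r + lam * (v - 1)) * (r - lam) ** (v - 1)
det_numeric = round(float(np.linalg.det(gram.astype(float))))

print(np.array_equal(gram, expected), det_numeric, det_formula)
```

Because r > λ, the closed form for the determinant is strictly positive, which is exactly what rules out a singular N when b < v.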

REFERENCES

[1] R. A. Fisher, "An examination of the different possible solutions of a problem in in-
complete blocks," Annals of Eugenics, London, Vol. 10 (1940), pp. 52-75.

[2] F. Yates, "Incomplete randomised blocks," Annals of Eugenics, London, Vol. 7 (1936),
pp. 121-140.


ABSTRACTS OF PAPERS

(Presented September 1, 1949 at Boulder at the Twelfth Summer Meeting of the Institute)

1. Structure of Statistical Elements. Duane M. Studley, Foundation Research,
Colorado Springs, Colorado.

Research in logical semantics and in practical elementation has set forth the proposition
that all words and ideas have set form. As a consequence of this universal proposition
all notions and conceptions in statistics should be accessible to set-theoretic analysis and
interpretation. This paper explains the results of a preliminary analysis performed on
statistical notions and conceptions with a view to a proper organisation of definitions and
conceptions which will, it is hoped, make possible a better and simpler construction of
statistics from a system of basic notions.





2. On the Relative Efficiencies of BAN Estimates. Leo Katz, Michigan State
College, East Lansing, Michigan.

J. Neyman, in the Proceedings of the Berkeley Symposium on Mathematical Statistics
and Probability, 1949, proved that χ² minimum estimates with either of two alternative
definitions of χ² are efficient, as also are the maximum likelihood estimates. He also raised
the question whether some of these estimates were better than others. This paper bears
on that question. In making χ² minimum estimates, it is often necessary to avoid small
frequencies by grouping together at least one tail of the distribution. It is with respect to
the parameters of these modified distributions that the χ² estimates are efficient. Define
relative efficiency in these circumstances as the ratio of the variance of an efficient estimator
in the unmodified case to that of one in the modified case. It is shown that, except for a
rectangular probability law, the relative efficiency is < 1 and, further, it decreases as the tail
grouping is made wider. Formulae are given for the relative efficiencies of χ² minimum
estimators for Binomial and Poisson probability laws and some representative values
computed to exhibit these effects.


3. Adjustment of an Inverse Matrix Corresponding to Changes in the Elements
of a Given Column or a Given Row of the Original Matrix. Jack Sherman
and Winifred J. Morrison, The Texas Company Research Laboratories,
Beacon, New York.

A simple computational procedure is derived for obtaining the elements $b'_{ij}$ of an nth
order matrix (B') which is the inverse of (A'), directly from the elements $b_{ij}$ of a matrix
(B) which is the inverse of (A), when (A') differs from (A) only in the elements of one
column, say the Sth column. The equations which form the basis of the computation are:

$b'_{Sj} = \frac{b_{Sj}}{\sum_{r=1}^{n} b_{Sr}\, a'_{rS}}, \qquad j = 1, 2, \cdots, n,$

$b'_{ij} = b_{ij} - b'_{Sj} \sum_{r=1}^{n} b_{ir}\, a'_{rS}, \qquad i = 1, 2, \cdots, S - 1, S + 1, \cdots, n; \quad j = 1, 2, \cdots, n.$

Analogous equations are derived for the case that A and A' differ in the elements of a
given row rather than a column.
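The two update equations translate directly into code. This is a minimal sketch assuming NumPy; the function name `inverse_after_column_change` is hypothetical, and the column index s is 0-based here, whereas the abstract counts columns from 1. The result is compared against a direct inversion of the modified matrix.

```python
import numpy as np


def inverse_after_column_change(b, new_column, s):
    """Inverse of A' from B = inverse(A), where A' is A with column s replaced."""
    b = np.asarray(b, dtype=float)
    # c_i = sum_r b_ir * a'_rS, the common sums in both update equations.
    c = b @ np.asarray(new_column, dtype=float)
    if c[s] == 0.0:
        raise ValueError("the modified matrix is singular")
    updated = np.empty_like(b)
    updated[s] = b[s] / c[s]                      # row S: b'_Sj = b_Sj / c_S
    for i in range(b.shape[0]):
        if i != s:
            updated[i] = b[i] - c[i] * updated[s]  # other rows: b'_ij = b_ij - b'_Sj * c_i
    return updated


a = np.array([[4.0, 1.0, 2.0], [1.0, 3.0, 0.0], [2.0, 0.0, 5.0]])
new_column = np.array([2.0, 7.0, 1.0])
s = 1

a_prime = a.copy()
a_prime[:, s] = new_column

b_prime = inverse_after_column_change(np.linalg.inv(a), new_column, s)
print(np.allclose(b_prime, np.linalg.inv(a_prime)))
```

The update costs on the order of n² operations once B is known, compared with roughly n³ for inverting A' from scratch.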

4. On the Problem of Optimum Classification. Paul G. Hoel, University of
California at Los Angeles.

Let $f_i$, (i = 1, 2, ..., k) be the probability density function of population i and let
$p_i$ be the probability that population i will be sampled. Assume certain differentiability
conditions. Then, for known parameters, the probability of a correct
classification will be maximized by choosing the set of regions $M_i$ which corresponds to
classifying into population i that part of variable space where $p_i f_i \ge p_j f_j$, (j = 1, ..., k).

If the parameters are unknown, an asymptotically optimum set of estimates will be
given by the set that minimizes a certain form in the covariances. Among uncorrelated
estimates, maximum likelihood estimates are seen to be asymptotically optimum.

If weight functions, $W_{ij}$, are introduced and the expected value of the loss is minimized,
the same methods of proof show that the region $M_i$ becomes that part of variable space
where $\sum_r p_r f_r (W_{rj} - W_{ri}) \ge 0$, (j = 1, 2, ..., k), and that the criterion for an
asymptotically optimum set of estimates is of the same form as the preceding criterion.





5. Optimum Linear Prediction of Stochastic Processes whose Covariances are
Green's Functions. C. L. Dolph and M. A. Woodbury, University of Michi-
gan, Ann Arbor.

A method of unbiased, minimal variance, linear prediction is presented for problems
similar to those of prediction and filtering treated by Wiener. It differs from these in that
the unbiased condition is imposed, only a part of the past is employed, and no stationary
assumption is made. It is shown that in the special stationary case discussed by Cunningham
and Hynd, "Random Processes in Problems of Air Warfare," Supp. Journal Royal
Stat. Soc. (1946), the correlation function, well known to be that of
the process defined by the Langevin equation, is the Green's function of the homogeneous
differential equation formed by letting the adjoint differential operator of the Langevin
equation operate on the operator of this equation. This relationship is shown to persist
for any physically stable linear differential equation driven by "white noise." The well-
known equivalence between integral and differential equations is then extended by use of
Stieltjes integrals and used to effect the solutions of the integral equations of the first kind
which yield the "optimum" linear prediction. The nonstationary example consisting of
purely random motion about a mean linear path in the presence of radar type errors is
treated in detail.


6. The Integral of the Gaussian Distribution over the Area Bounded by an
Ellipse. H. H. Germond, RAND Corporation, Santa Monica, California.

This paper describes the preparation of tables from which to obtain the integral of a
bivariate Gaussian distribution over the area of an ellipse. The center of the ellipse need
not coincide with the mean of the Gaussian distribution, nor need the axes of the ellipse
have any special orientation with respect to the Gaussian distribution.

7. Theorems on Convergency of Compound Distributions with Symmetric Com-
ponents. (By title) Maria Castellani, University of Kansas City.

The purpose of this paper is to present some results obtained when operations of
convolution in $R_1$ are concerned with a specific family of distributions. The compound
distribution $K(x) = F(x) * G(x)$ is here obtained combining any d.f. F(x) with a d.f. G(x) under
the restriction of symmetry, i.e., $G(x + h) + G(x - h) = 1$ for any h > 0.

A generalization of Cantelli's inequalities will enable us to write a preliminary theorem
on the following upper and lower bounds:

$F(a - h) - 2\int_h^\infty dG(y) \le K(a) \le F(a + h) + 2\int_h^\infty dG(y),$

$K(a - h) - 2\int_h^\infty dG(y) \le F(a) \le K(a + h) + 2\int_h^\infty dG(y),$

where a is any point in $R_1$ and h > 0.

The theorem is derived assuming the Stieltjes integral,

$\int_{-\infty}^{\infty} F(a - y)\,dG(y),$

is taken as a sum of three integrals connected with three convenient intervals (-∞, -h),
(—A, A), (A, -l-«i). When the symraetno component of the convolution ia amomber of afam- 





2 r‘ 

ily of normal .li«lni.nhom. wrh w r;,fT, - J e~-V 

amotor. tho imo of C-fttKrlhV »ivo 

■' f* I 

KM - « ™ " J. ' “ 

V » 


«is an arbitrary par- 


: A «'<*■( ft) ~ Ka(a) j. 




du. 


whtTo K,(ri - * f'-**** 

Thu d.f. 18 


— r"^ ft' . a fr. f •yW which is everywhere 

uniformly continuous. For an arbitrarily small η > 0, a convenient small h and large α
may be found which will enable us to prove the following two theorems:

THEOREM 1: Given any d.f. F(x) in $R_1$, there exists a convenient continuous d.f. $K_\alpha(x)$
which for α → ∞ converges asymptotically and uniformly almost everywhere to the given
d.f.

THEOREM 2: Given any d.f. F(x) in $R_1$, there exists in any continuity bordered interval
a convenient uniformly convergent sequence of continuous functions which asymptotically
approach the given F(x).


8. Partial Sums of the Negative Binomial in Terms of the Incomplete Beta-
Function. (By title) Julius Lieblein, Statistical Engineering Laboratory,
National Bureau of Standards.

In acceptance sampling a certain size sample is taken at random from a lot of items and
the lot is accepted if the number of defective items does not exceed a predetermined number
characteristic of the sampling plan. The Statistical Engineering Laboratory has been
studying the probabilities that a decision to accept or reject can be made before the sample
is completely inspected. Such probabilities are found to involve certain sums apparently
not previously treated. In this note the author proves a simple identity connecting these
sums which greatly facilitates their computation and shows how they may be written in
terms of the well known incomplete beta-function of Karl Pearson, for which extensive
tables are available.

9. Large Sample Tests and Confidence Intervals for Mortality Rates. (By title)
John E. Walsh, RAND Corporation, Santa Monica, California.

In computing mortality rates from insurance data, the unit of measurement used is fre-
quently based on number of policies or amount of insurance rather than on lives. Then
the death of one person may result in several units of "death" with respect to the investi-
gation; moreover, the number of units per individual may vary noticeably. Thus the usual
large sample methods of obtaining significance tests and confidence intervals for the true
value of the mortality rate are not applicable to these situations. If the number of units
associated with each person in the investigation were known, accurate large sample results
could be obtained; however, determination of the number of units associated with each
individual would require an extremely large amount of work. This article presents some
valid large sample tests and confidence intervals for the mortality rate which do not
require much work and are reasonably efficient. The procedure followed consists in first
dividing the risks into twenty-six subgroups on the basis of the first letter of the last name
of the person insured. Some of the groups are then combined until 10 to 15 subgroups
yielding approximately the same number of units are obtained. The statistic consisting of
the total number of units paid divided by the total number of units is computed




for each subgroup. These ratios are regarded as independent observations from continuous
symmetrical populations with median equal to the true value of the rate of mortality.
Tests and confidence intervals for the rate of mortality are obtained by applying results
of "Some Significance Tests for the Median which are Valid Under Very General Conditions"
(Annals of Math. Stat., Vol. 20 (1949), pp. 64-81) to these ratios.
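The procedure in this abstract can be illustrated with a minimal sketch. Two parts of it are assumptions and not taken from the abstract: the rule used to combine letter groups (a greedy balancing of exposure), and the interval itself, which is the classical order-statistic (sign test) interval for a median, used here as a simpler stand-in for the tests of the cited paper. The names subgroup_ratios and median_interval and the record layout are hypothetical.

```python
from collections import defaultdict
from math import comb


def subgroup_ratios(records, n_groups=10):
    """Pool (last_name, units_exposed, units_paid) records by first letter,
    merge the letter groups into n_groups subgroups of roughly equal exposure,
    and return each subgroup's ratio of units paid to units exposed."""
    exposed = defaultdict(float)
    paid = defaultdict(float)
    for last_name, units_exposed, units_paid in records:
        letter = last_name[0].upper()
        exposed[letter] += units_exposed
        paid[letter] += units_paid

    # Assumed balancing rule (not from the abstract): take letters from largest
    # to smallest exposure and add each to the currently lightest subgroup.
    groups = [[0.0, 0.0] for _ in range(n_groups)]
    for letter in sorted(exposed, key=exposed.get, reverse=True):
        lightest = min(groups, key=lambda g: g[0])
        lightest[0] += exposed[letter]
        lightest[1] += paid[letter]
    return [p / e for e, p in groups if e > 0]


def median_interval(values, k):
    """Order-statistic interval (x_(k), x_(n-k+1)) for a common median.
    Returns (lower, upper, coverage); the coverage is exact for independent
    observations from continuous populations sharing that median."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n // 2:
        raise ValueError("k must satisfy 1 <= k <= n // 2")
    # One tail of Binomial(n, 1/2): probability of at most k-1 successes.
    tail = sum(comb(n, i) for i in range(k)) / 2 ** n
    return xs[k - 1], xs[n - k], 1 - 2 * tail
```

With ten subgroup ratios, median_interval(ratios, 2) returns the second smallest and second largest ratio together with the coverage probability of that interval. No per-person unit counts are needed, which is the saving in work that the abstract describes.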


NEWS AND NOTICES

Readers are invited to submit to the Secretary of the Institute news items of interest

Personal Items

Mr. Fred C. Andrews will be a teaching assistant in the Statistical Laboratory,
Department of Mathematics, University of California for the academic year.

Dr. Joseph Berkson has been promoted to the rank of Professor in the University
of Minnesota Graduate School and Mayo Foundation. He continues as
Chief of the Division of Biometry and Medical Statistics of the Mayo Clinic.

Mr. Colin R. Blyth is now a research assistant at the University of California,
Statistical Laboratory, Berkeley.

Mr. Clyde A. Bridger is now Director of the Section of Statistics and State
Registrar of Vital Statistics for the Division of Health of Missouri.

Mr. Loren V. Burns, formerly with the MFA Milling Company at Springfield,
Missouri, has been made Vice-President and Technical Director of the Spear
Mills, Inc., Kansas City 6, Missouri.

Professor Douglas Chapman, who obtained his Ph.D. in statistics at the University
of California, Berkeley, has accepted an appointment as Assistant Professor
at the University of Washington in the Department of Mathematics and
the Laboratory of Statistical Research.

Dr. Andrew Laurence Comrey, who received his doctor's degree from the University
of Southern California last June, has accepted an assistant professorship
in the Department of Psychology at the University of Illinois.

Dr. Donald A. Darling has been appointed to an instructorship in the Department
of Mathematics, University of Michigan.

Dr. Paul M. Densen resigned his position as Chief of the Division of Medical
Research Statistics of the Department of Medicine and Surgery of the Veterans
Administration as of July 1, 1949 to join the staff of the Graduate School of Public
Health, University of Pittsburgh, as an Associate Professor of Biostatistics.

Mr. Amrom H. Katz has been promoted to the position of Chief Physicist of
the Photographic Laboratory, Engineering Division, Air Material Command,
Wright-Patterson Air Force Base, Dayton, Ohio.

Associate Professor Louis Guttman, who had been on leave for two years from
the Department of Sociology of Cornell University conducting a research program
in Israel, was invited to remain in Israel for another year to direct the
activities of the recently founded Israel Institute of Public Opinion Research.
He is serving as Chief Consultant.

Mr. Henie Ernest LaFontant, who was attending the University of Michigan
during the academic year 1948-1949 working on his doctor's degree, has accepted
a position as statistician for the B.T.W. Insurance Co. at Birmingham, Alabama.

Assistant Professor Jerome C. R. Li has been promoted to Associate Professor
of Mathematics at the Oregon State College, Corvallis, Oregon.

Professor H. B. Mann of Ohio State University has accepted a visiting
professorship and research associateship at the Statistical Laboratory at
Berkeley, California for the year 1949-1950.

Dr. Gottfried E. Noether has been appointed to an instructorship at New York
University.

Dr. G. R. Seth has just returned from a trip to England, Sweden, France and
India where he visited statistical institutions.

Assistant Professor Andrew Sobczyk has been promoted to Associate Professor
of Mathematics at Boston University.

Dr. Zenon Szatrowski, formerly with the Economics Department of Northwestern
University, has accepted an associate professorship in the School of
Business Administration, University of Buffalo.

Professor Gerhard Tintner has returned to his teaching and research duties at 
Iowa State College after spending a year at the Department of Applied Eco¬ 
nomics at the University of Cambridge, England. He gave a course on Econ¬ 
ometrics at the University of Cambridge and during his stay in Europe, he 
lectured on econometric and statistical subjects in Universities at Bristol, Dublin, 
Hull, Paris, Manchester and Uppsala. 

Dr. A. E. R. Westman, Director of the Department of Chemistry, Ontario 
Research Foundation, left in September, 1949 for England where he is visiting
industrial research laboratories and engaging in studies in the Department of 
Physical Chemistry, Cambridge University. He plans to return in June, 1950. 


Word has just been received here of the formation of the New Zealand Statistical
Association. The initial meeting was held in August, 1948 at Victoria University
College. The officers are: J. T. Campbell, President; I. D. Dick, Secretary.
It is planned to hold one formal meeting a year at first with the hope of increasing
this later. The main interest in statistical work in New Zealand has been biological,
but there is scope for considerable extension to industrial, educational and
economic fields and it is hoped the formation of the Association will assist in this
extension.


New Members 

The following persons have been elected to Membership in the Institute
(June 1, 1949 to August SS, 1949) 

AI-Doori, Yotmls A., Student at the University of California, 19JS Henry Street, Bcrkele'f,- 
California, 





Wi'bifi Itebuft A., A It in-f fol}? ; AM* rWi/gmw. 

Bala, Clttttld® Afl|®Ilca, I’h f}, (!'«!»' <fi IWnrtw, ATgr$iifm- Vnitmily ol 

BttPlw* Ainw, f'-'trn 4* Uvrn->> Atf'^n. Afg*"fiU^ 

04Jirtet, WwlnE„ Ph J) sTniv . 1--JitiJMapsw. |■'a|n}<»w!^*a N«rt}iT«slin|. 
w,! ft>lwwl, PalmPWiflfi 5'«»l*.rs<S 

BftuflUf JaiwmB., 0ijt. lyi. <A|pSt»<!«rn fsiiv ' »»! Jiietih^mrilKM. %VweMl(8Teeb. 

al«I (VIIe|{ii,Tigte*« Hil! 3?i, >* H W , Aw«ra!m. 

Eiirtlaft HamAB 0<, Ph l>. <riwnl*n4A«* I'****' ' IfPrturoT m bilaiMiiiiw*, Ib(if»wtmenl of 
StslfeUta,BnivttwiJy fVlIpjf*'. W c' S. I.hslsa'f 

Immel, Eric R., M.A. (Queen's Univ., Kingston, Ontario) Teaching Assistant and Graduate
Student, Department of Mathematics, University of California at Los Angeles,
Los Angeles, California.

KeUy, J&liii P., ffeniw Tetfinira.! Kn^mw, f’a3'l<i«Ie nod ftwHac*! t'wiwraUon, 

Oak Ridp', Tennwsuw, P O Hax i7S, Sanfm, r»'nn< f(ff 

Perez, Cristina P., M.S. (Univ. of Michigan) Instructor, Department of Mathematics,
University of the Philippines, Manila, P. I.

Piitllpson, CAfl 0., n.Sp. (litilv of Sl»?rKlirjlfn) Artwarj of Fnikf'. SamarJ^’N*. Ynypna^^n 
S, njuruMta, Hwf4en. 

Porier, Robnrt A., Ph,I). CK.f*. Slain C'ciHbisp, Ualwsh. 'N' f" ; S» ninr Math»*maiipian, Uni¬ 
versity of CWeARO, I7HII Lmugfrlhiii Amnue , Ito'ntttui^l, 

Rippe, Dayle D., M.A. (Univ. of Nebr.) Student, Teaching Fellow, Department of Mathematics,
University of Michigan, Willow Run, Michigan.

Regers, Robert L,, A.11. (tiniv. nf (^alif j Student at I'riivrTWty **f (‘nlifornin, KwU i, 
Hox74, Dmki Amnue, (hlrtiji.Cidi/ornm. 

Roy, Samarendra N., M.Sc. (Calcutta Univ.) Head of Department of Statistics, Calcutta
University and Assistant Director, Indian Statistical Institute (now on leave) P.O.
Box 168, Chapel Hill, North Carolina.

SeveytRosttnuiry, M,B.A, (Ifniv. of Wise.) fSmduaU’ AwwnanlnndStudetri, UnivemUy of 
Wteooruiin, MS/S Norurntd Pine*, Modison S, 


REPORT ON THE BOULDER MEETING OF THE INSTITUTE

The Twelfth Summer Meeting of the Institute of Mathematical Statistics
was held at the University of Colorado, Boulder, Colorado, Monday, August 29
through Thursday, September 1, 1949. The meeting was held in conjunction
with the summer meetings of the American Mathematical Society, the Mathematical
Association of America, and the Econometric Society. The meeting was
attended by the following 79 members of the Institute:

S. P. Agarwal, R. L. Anderson, T. W. Anderson, V. L. Anderson, K. J. Arnold, E. W.
Barankin, C. A. Bennett, Agnes Berger, E. E. Blanche, A. H. Bowker, J. C. Brixey, Jean
Bronfenbrenner, J. H. Bushey, H. C. Carver, Herman Chernoff, K. L. Chung, A. G. Clark,
E. P. Coleman, E. L. Crow, J. H. Curtiss, W. J. Dixon, J. L. Doob, Aryeh Dvoretzky,
H. P. Evans, W. D. Evans, W. T. Federer, William Feller, C. H. Fischer, J. H. Frame, T. C.
Fry, H. M. Gehman, H. H. Germond, R. E. Greenwood, H. T. Guard, P. R. Halmos, J. L.
Hodges, P. G. Hoel, Harold Hotelling, J. M. Howell, C. C. Hurd, C. A. Hutchinson, Irving
Kaplansky, Leo Katz, H. S. Konijn, T. C. Koopmans, G. M. Kuznets, H. D. Larsen, D. H.
Leavens, S. B. Littauer, H. B. Mann, Jacob Marschak, F. J. Massey, Dorothy J. Morrow,
Jerzy Neyman, M. L. Norden, J. I. Northam, E. G. Olds, R. P. Peterson, G. B. Price,
Mina S. Rees, P. R. Rider, F. D. Rigby, Herman Rubin, L. J. Savage, Elizabeth L. Scott,
I. E. Segal, Esther Seiden, Jack Sherman, W. B. Simpson, Milton Sobel, D. M. Studley,
B. R. Suydam, A. G. Swanson, James Templeton, R. M. Thrall, J. W. Tukey, Abraham
Wald, John Wishart, S. S. Wilks.

The Monday afternoon session was devoted to invited addresses with Professor
Leonard J. Savage of the University of Chicago presiding. The attendance
was approximately fifty. Professor J. L. Hodges of the University of California
presented a paper, Some Problems in Point Estimation, and Professor W. T.
Federer of Cornell University presented A Comparison of the Proportionality of
Covariance Matrices.

On Tuesday morning the Institute, the Mathematical Association of America,
and the Econometric Society held a joint symposium on Mathematical Training
for Social Scientists. Professor Jacob Marschak of the Cowles Commission for
Research in Economics presided. The attendance was approximately one hundred
fifty. The participating speakers were: Professor R. L. Anderson of North
Carolina State College; Professor T. W. Anderson of Columbia University;
Professor G. C. Evans of the University of California; Professor F. L. Griffin
of Reed College; Professor Harold Gulliksen of Educational Testing Service;
Professor William Jaffé of Northwestern University; Professor Harold Hotelling
of the University of North Carolina; and Professor G. M. Kuznets of the University
of California. At the end of the session the following resolution was
adopted by those in attendance at the meeting:

Members of the Mathematical Association of America, the Institute of Mathematical
Statistics, and the Econometric Society assembled in a joint session in Boulder, Colorado,
on August 30, 1949, are of the opinion that officers of these societies should study the
need for better mathematical training of social scientists, and the ways and means to
improve mathematical preparation of social scientists, and that such a study may be
most effectively conducted by a joint committee, possibly in co-operation with other
interested societies, and in close touch with the Social Science Research Council, the
National Research Council, or other national bodies concerned with general education
and research. It is suggested that this committee report the results of its deliberations
at the next joint meeting of the original participating societies.

The two joint sessions of the Institute and the Econometric Society were
devoted to a Symposium on Statistical Inference in Decision Making. Professor
Jerzy Neyman of the University of California presided on Tuesday afternoon.
The attendance was approximately eighty. Professor Aryeh Dvoretzky of
Hebrew University, Jerusalem presented Decision Problems and Professor Abraham
Wald of Columbia University presented Some Recent Results in the Theory
of Statistical Decision Functions. On Wednesday morning, under the chairmanship
of Professor Wald and an attendance of approximately seventy-five, the
following papers were presented: Remarks on a Rational Selection of a Decision
Function by Professor Herman Chernoff of the Cowles Commission for Research
in Economics; Psychological Probabilities by Professor Leonard J. Savage; and
Complete Classes of Decision Functions for Some Standard Sequential and Non-sequential
Problems by Milton Sobel of Columbia University.



iftlbM IV f\ppr'»%5mat'i*!y wv#Biy-g^ 

'Hii' ‘0vtv iirmtiV*!' 

t s%wfw^ «f 

Mr. 0IMW# M, (Vt^nih, Mfini^p. 

3, (|« ftf ftluft* tjf MA ^ 

Professor Leo Katz, Michigan State College

3. Adjustment of an Inverse Matrix Corresponding to Changes in the Elements of a Given
Column or a Given Row of the Original Matrix.

Dr. Jack Sherman and Miss Winifred J. Morrison, The Texas Company Research
Laboratories, Beacon, New York

A, On M4m Opimnm n-tmp^im 
f*mf«i»r l*'will (5. IfiwJ, ihmniif al Ii<f« 4#cf!«s 

S. Ofthimt f/ fy-irtm '.r^'w f'wfimufa #w fkfm'i Fhhk- 

(am 

Pojfwipir f*. J, W}jtt,w4l*T M A W.w4H»y,rM»»'f»i<v«4 AtK!>wan. 

S, fhg {nk§fnl i/f fAj* Ottt/MPiia. «»*>■ fV 4'f'K» m Mkp«, 

Dr. II. II, K*w»l f ,‘^wu Alwfra, ('ahf«rtiW 

7, ffcwriw « «/ ftfinAaiww f^yi^metrir Cf/mpumniJJi. 

(By uUel 

Dr. Mam t!wtriJa«i, r«iv«»wiif »f 

8. Large Sample Tests and Confidence Intervals for Mortality Rates. (By title)

Dr. J. E. Walsh, Rand Corporation, Santa Monica, California

9. Partial Sums of the Negative Binomial in Terms of the Incomplete Beta Function.
(By title)

Dr. Julius Lieblein, National Bureau of Standards

On Thursday afternoon Professor Jerzy Neyman presented the Second Rietz
Mettkortnl « Cmmlml e/ Ar* /i«rar fitriithiral in 

dm (hnml Cm 9 / Harrtld flolellmg prtwirW and the 

attendance was approximately fifty. Dr. R. P. Boas, Jr. of Mathematical Reviews
presented an invited address, The Representation of Probability Distributions by
Charlier Series.

The Institute sponsored a beer party on Tuesday and on Thursday
evening a fry was held on Flagstaff Mountain.


H4Bi»T.GnAnn 






